Resights AVM
Snowflake · Updated 2024-10-03 · Indexed 2024-10-04
Download link:
https://app.snowflake.com/marketplace/listing/GZSYZP5GJV
Description:
# Resights Hackathon

This listing contains datasets for free public exploration of interesting AI/ML use cases based on publicly available Danish data.

The data has been processed and prepared by Resights and is free to use. Flex your AI/ML skills and tell us about your results!

Currently only one dataset is available. It covers trades of condominiums (Ejerlejligheder) in all of Denmark from 1 January 2015 to 30 June 2024.
More datasets may follow, covering other real estate types or entirely different topics that could be interesting to work with.

Please contact us at mikkel@resights.dk if you have any questions or encounter bugs.
## AVM Ejerlejligheder
Consists of the following two datasets:

**HACKATHON.AVM_EJERLEJLIGHEDER_TRAIN**
Used to train your AI/ML model. Contains both the price and the SQM price of the property (the target values).

**HACKATHON.AVM_EJERLEJLIGHEDER_TEST**
Same as the `TRAIN` dataset, except that `PRICE` and `SQM_PRICE` are set to `null`.

**Attributes**
- `TRANSACTION_ID`: Unique identifier of the trade.
- `BUILDING_ID`: Unique identifier of the building that the condominium is located within. May be used to relate trades within the same building as an anchor for price.
- `UNIT_ID`: Unique identifier of the unit. Can be used to trace units that have been sold multiple times within the timeframe.
- `FLOOR`: The floor that the condominium is located on. Typically an integer, but can also be `st` (stuen, ground floor) or `kl` (kælder, basement).
- `MUNICIPALITY_CODE`: Code of the municipality that the condominium is located in. See https://danmarksadresser.dk/adressedata/kodelister/kommunekodeliste
- `ZIP_CODE`: The ZipCode of the city that the condominium is located in.
- `STREET_CODE`: Unique identifier of the street that the condominium is located on. A street code is only unique within a municipality, which is why it is prefixed with the municipality code.
- `TRADE_DATE`: The date that the trade happened on, typically representing the purchase agreement date.
- `PRICE`: The price the condominium was traded at.
- `SQM_PRICE`: The price per square metre that the condominium was traded at. Equivalent to `PRICE` / `AREA_RESIDENTIAL`.
- `CONSTRUCTION_YEAR`: The year the building that the condominium is located within was constructed.
- `REBUILDING_YEAR`: The year the building that the condominium is located within was last rebuilt.
- `AREA_TINGLYST`: Tinglyst area (area of interior apartment).
- `AREA_RESIDENTIAL`: Residential area which is measured from outside walls and contains common access area. Typically larger than the Tinglyst area.
- `AREA_OTHER`: Indicates the unit's secondary area, e.g., basement or attic, that is registered on the condominium deed.
- `AREA_COMMON_ACCESS_SHARE`: Indicates the area of the unit's share of the building's access area.
- `AREA_CLOSED_COVER_OUTHOUSE`: Indicates the unit's area of closed coverage. Is not included in the unit's total area.
- `AREA_OPEN_BALCONY_ROOFTOP`: Indicates the unit's area of open coverage. Is not included in the unit's total area.
- `NUMBER_ROOMS`: Number of rooms in the condominium.
- `FACILITIES_TOILET`: Toilet facilities, see column for values.
- `FACILITIES_SHOWER`: Shower facilities, see column for values.
- `FACILITIES_KITCHEN`: Kitchen facilities, see column for values.
- `HAS_ELEVATOR`: Whether the building that the condominium is located in has an elevator.
- `LNG`: Longitude of the location of the condominium.
- `LAT`: Latitude of the location of the condominium.
- `DISTANCE_LAKE`: Direct distance to the nearest lake.
- `DISTANCE_HARBOUR`: Direct distance to the nearest harbour.
- `DISTANCE_COAST`: Direct distance to the nearest coast.

**Notes**
- The data generally contains all trades of condominiums from 1 January 2015 to 30 June 2024; the `TEST` dataset contains trades after this period.
- Some trades might have been removed due to potential errors in the publicly available dataset.
- Trades only cover ordinary free trades, not e.g. family trades.
- Distances are calculated as direct distances, so a coast might not be visible from the condominium even though it is close by.
- Trades containing multiple properties have been removed.
- Trades with more than one unit have been removed.
- Trades below 500.000 DKK or above 100.000.000 DKK have been removed.
- Trades with an SQM price below 1.000 DKK or above 250.000 DKK have been removed.
- Trades reflect the attributes at the date of trade as far as possible. If the property has changed significantly since then, the trade might have been removed.
- `TRANSACTION_ID`, `BUILDING_ID` and `UNIT_ID` have been replaced with deterministic UUIDs without any loss of information. This keeps the dataset focused solely on AVM purposes and prevents participants from looking up the original trade prices in public data for evaluation purposes.

**Considerations**
- Trades are historic, and the usual market trend is that prices increase over time. Thus a trade from 2015 probably does not reflect the price of a property with the same characteristics in 2025.
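Because prices drift over time, one possible way to account for this is to give the model an explicit time feature and to validate on the most recent trades rather than on a random sample. The sketch below illustrates the idea with plain Python. The helper names `months_since` and `split_by_date`, the cutoff date, and the sample rows are all hypothetical and not part of the dataset.

```python
from datetime import date


def months_since(trade_date: date, start: date = date(2015, 1, 1)) -> int:
    """Whole calendar months between `start` and `trade_date`."""
    return (trade_date.year - start.year) * 12 + (trade_date.month - start.month)


def split_by_date(rows, cutoff: date):
    """Split rows (dicts with a TRADE_DATE key) into (older, newer) around `cutoff`."""
    older = [r for r in rows if r["TRADE_DATE"] < cutoff]
    newer = [r for r in rows if r["TRADE_DATE"] >= cutoff]
    return older, newer


# Made-up rows for illustration only; real rows come from the TRAIN table.
rows = [
    {"TRADE_DATE": date(2015, 3, 1), "SQM_PRICE": 20000.0},
    {"TRADE_DATE": date(2018, 6, 15), "SQM_PRICE": 26000.0},
    {"TRADE_DATE": date(2021, 9, 30), "SQM_PRICE": 34000.0},
    {"TRADE_DATE": date(2024, 5, 20), "SQM_PRICE": 38000.0},
]

# Add a numeric time feature so a model can learn the price trend.
for row in rows:
    row["MONTHS_SINCE_START"] = months_since(row["TRADE_DATE"])

# Hold out the most recent trades for validation (hypothetical cutoff).
fit_rows, valid_rows = split_by_date(rows, cutoff=date(2023, 7, 1))
```

Validating on the newest trades mimics the actual task, since the `TEST` dataset contains trades after the training period.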
## **GET STARTED**
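Load the tables as Snowpark DataFrames with Python & Snowflake (call `to_pandas()` to convert to a pandas DataFrame):
```python
import snowflake.snowpark as snowpark


def main(session: snowpark.Session):
    # Reference the train and test tables as Snowpark DataFrames
    df_train = session.table('AVM_EJERLEJLIGHEDER_TRAIN')
    df_test = session.table('AVM_EJERLEJLIGHEDER_TEST')
    # Preview the first rows of the training data
    df_train.show()
    # Use df_train.to_pandas() if a pandas DataFrame is needed
    return df_train
```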
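Once the data is loaded, a few attributes may need light preparation before modelling. The following is a minimal sketch, not an official recipe. It maps `FLOOR` values to integers and recomputes the SQM price from `PRICE` and `AREA_RESIDENTIAL`. The helper names `floor_to_int` and `sqm_price` are hypothetical, and the numeric encoding of `st` as 0 and `kl` as -1 is an assumption made here for illustration.

```python
from typing import Optional


def floor_to_int(floor: Optional[str]) -> Optional[int]:
    """Map a FLOOR value to an integer level, or None if it cannot be parsed."""
    if floor is None:
        return None
    value = str(floor).strip().lower()
    # Assumption: encode `st` as level 0 and `kl` as level -1.
    special = {"st": 0, "kl": -1}
    if value in special:
        return special[value]
    try:
        return int(value)
    except ValueError:
        return None


def sqm_price(price: Optional[float], area_residential: Optional[float]) -> Optional[float]:
    """Recompute SQM_PRICE as PRICE / AREA_RESIDENTIAL, guarding against missing or zero area."""
    if price is None or area_residential is None or area_residential <= 0:
        return None
    return price / area_residential


# Example values (made up for illustration)
levels = [floor_to_int(v) for v in ["st", "kl", "3", " 2 ", "unknown", None]]
recomputed = sqm_price(3_000_000, 75)
```

On a pandas DataFrame, `floor_to_int` could be applied to the `FLOOR` column with `Series.map`.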
Provider:
Resights ApS
Created:
2024-09-29



