About This Dataset
This dataset contains daily electricity consumption readings for a set of users in a smart grid. Each user is identified by a unique ID, and the readings span January 1, 2014, to October 31, 2016. The data can be used to analyze consumption patterns and to detect irregular usage, which may indicate electricity theft.
Data Columns
- UserId: Unique identifier for each user.
- IsStealer: Binary label where 1 indicates a user suspected of theft and 0 indicates a regular user.
- Daily Consumption Columns: Each date from 1/1/2014 to 10/31/2016 has a corresponding column that shows the amount of electricity consumed (in kWh) by the user on that day.
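Because each date is its own column, the table is in a wide layout, and most time series tools expect one row per user per day. The sketch below shows one way to reshape it using only the standard library. It assumes the date column headers are written as M/D/YYYY (as in the range above) and that an empty cell means a missing reading; the sample rows and the `wide_to_long` helper are made up for illustration and are not part of the dataset.

```python
import csv
import io
from datetime import datetime

# Tiny made-up sample in the layout described above (values are illustrative only).
SAMPLE_CSV = """UserId,IsStealer,1/1/2014,1/2/2014,1/3/2014
u001,0,2.5,3.1,2.8
u002,1,0.0,0.4,
"""

META_COLUMNS = ("UserId", "IsStealer")


def wide_to_long(handle):
    """Yield one (user_id, is_stealer, date, kwh) tuple per user per day."""
    reader = csv.DictReader(handle)
    # Every column that is not metadata is treated as a daily consumption column.
    date_columns = [c for c in reader.fieldnames if c not in META_COLUMNS]
    for row in reader:
        for col in date_columns:
            raw = row[col]
            # Assumption: an empty cell is a missing reading, kept as None.
            kwh = float(raw) if raw not in ("", None) else None
            # Assumption: headers use the M/D/YYYY format.
            day = datetime.strptime(col, "%m/%d/%Y").date()
            yield row["UserId"], int(row["IsStealer"]), day, kwh


records = list(wide_to_long(io.StringIO(SAMPLE_CSV)))
for record in records:
    print(record)
```

For the full file, the same reshape is commonly done with pandas via `DataFrame.melt`, using `UserId` and `IsStealer` as the identifier columns.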
Key Insights and Applications
- Consumption Patterns: Researchers can analyze consumption trends and detect unusual patterns across days, weeks, and months.
- Anomaly Detection: The dataset enables the use of machine learning and anomaly detection techniques to identify suspicious behavior.
- Federated Learning Potential: Because the data is labeled, it can be used to develop and evaluate theft detection schemes in a federated learning setup, where training is decentralized and raw consumption data does not need to be pooled.
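As a concrete illustration of the anomaly detection idea, the sketch below flags days whose consumption deviates strongly from a user's own typical level, using a robust z-score built from the median and the median absolute deviation (MAD). This is one simple technique among many and is not prescribed by the dataset; the function names, the 3.5 threshold, and the sample readings are assumptions chosen for the example.

```python
from statistics import median


def robust_z_scores(readings):
    """Return a modified z-score for each reading using the median and MAD."""
    med = median(readings)
    mad = median(abs(x - med) for x in readings)
    if mad == 0:
        # At least half of the readings equal the median; no spread to scale by.
        return [0.0 for _ in readings]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores
    # when the data is roughly normally distributed.
    return [0.6745 * (x - med) / mad for x in readings]


def flag_anomalies(readings, threshold=3.5):
    """Return the indices of readings whose |robust z-score| exceeds the threshold."""
    scores = robust_z_scores(readings)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Made-up daily kWh values for one user; the fifth day is unusually low.
daily_kwh = [3.1, 2.9, 3.3, 3.0, 0.1, 3.2, 2.8]
print(flag_anomalies(daily_kwh))  # prints [4]
```

A flagged day is only a candidate for review: a sudden drop can also come from an empty household or a meter fault, so scores like these are better treated as inputs to a model than as a verdict.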
Possible Analyses
- Time Series Analysis: Observe how electricity usage fluctuates over time for individual users or across the entire dataset.
- User Segmentation: Group users based on consumption habits or theft likelihood.
- Theft Detection: Utilize the IsStealer label to train machine learning models for identifying potential theft, improving grid security.
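Before training a model on the IsStealer label, each user's daily series is often condensed into a few summary features. The sketch below shows one possible way to do that with the standard library. The feature set (`mean_kwh`, `std_kwh`, `zero_share`, `missing_share`), the `user_features` helper, and the two sample users are hypothetical choices for illustration, not something defined by the dataset.

```python
from statistics import mean, pstdev


def user_features(daily_kwh):
    """Summarise one user's daily readings as a small feature dict."""
    # Assumption: missing readings are represented as None.
    observed = [x for x in daily_kwh if x is not None]
    if not observed:
        return {"mean_kwh": 0.0, "std_kwh": 0.0, "zero_share": 0.0, "missing_share": 1.0}
    return {
        "mean_kwh": mean(observed),
        "std_kwh": pstdev(observed),
        # Share of observed days with exactly zero consumption.
        "zero_share": sum(1 for x in observed if x == 0) / len(observed),
        # Share of days with no reading at all.
        "missing_share": 1 - len(observed) / len(daily_kwh),
    }


# Made-up users: user id -> (IsStealer label, daily kWh readings).
users = {
    "u001": (0, [2.5, 3.1, 2.8]),
    "u002": (1, [0.0, 0.4, None]),
}

# X holds one feature dict per user and y the matching IsStealer labels.
X = [user_features(readings) for _, readings in users.values()]
y = [label for label, _ in users.values()]
```

The resulting `X` and `y` can then be passed to a classifier of your choice after converting the feature dicts to numeric arrays. When evaluating, keep all of a user's data on one side of the train/test split so the model is scored on users it has not seen.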
Usage
This dataset is suitable for experiments in time series analysis, anomaly detection, and federated learning, particularly in the context of smart grids. By leveraging this dataset, researchers and data scientists can explore new approaches to improving the integrity and efficiency of smart grid systems.