Big Banyan Tree

Recent Activity

GRMenon  updated a Space 3 months ago
big-banyan-tree/README
asquirous  updated a dataset 3 months ago
big-banyan-tree/BBT_CommonCrawl_2018
asquirous  updated a dataset 3 months ago
big-banyan-tree/BBT_CommonCrawl_2019

BigBanyanTree is an initiative to help engineering colleges set up their own data engineering clusters and to drive interest in data processing and analysis with tools such as Apache Spark.

As part of that initiative, we have open-sourced datasets processed from CommonCrawl data.

Each dataset offers two subsets with the following columns:
script_extraction: ["ip", "host", "server", "script_src_attrs", "year"]
ipmaxmind: ["ip", "host", "server", "postal_code", "latitude", "longitude", "accuracy_radius", "continent_code", "continent_name", "country_iso_code", "subdivision_code", "city_name", "metro_code", "time_zone", "year"]
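
As a rough illustration of how the two subsets relate, the sketch below records the column lists given above and computes which columns they share, which is where a join between them would most plausibly happen. The helper names `shared_columns` and `load_subset` are hypothetical. `load_subset` additionally assumes that the repositories follow the `big-banyan-tree/BBT_CommonCrawl_<year>` naming seen in the activity feed and that each subset is exposed as a dataset configuration; neither assumption is confirmed by this page, so treat it as a starting point rather than a documented API.

```python
# Column lists copied from the dataset description above.
SUBSET_COLUMNS = {
    "script_extraction": ["ip", "host", "server", "script_src_attrs", "year"],
    "ipmaxmind": [
        "ip", "host", "server", "postal_code", "latitude", "longitude",
        "accuracy_radius", "continent_code", "continent_name",
        "country_iso_code", "subdivision_code", "city_name", "metro_code",
        "time_zone", "year",
    ],
}


def shared_columns(left: str, right: str) -> list[str]:
    """Return the columns present in both subsets, in the left subset's order."""
    right_cols = set(SUBSET_COLUMNS[right])
    return [col for col in SUBSET_COLUMNS[left] if col in right_cols]


def load_subset(year: int, subset: str):
    """Load one subset from the Hugging Face Hub (requires network access).

    Assumptions (not confirmed by the source): the repo id is
    big-banyan-tree/BBT_CommonCrawl_<year>, and the subset name is
    available as a dataset configuration.
    """
    if subset not in SUBSET_COLUMNS:
        raise ValueError(f"unknown subset: {subset!r}")
    # Imported lazily so the rest of this sketch runs without the package.
    from datasets import load_dataset  # third-party: pip install datasets
    return load_dataset(f"big-banyan-tree/BBT_CommonCrawl_{year}", subset)


# Columns that could serve as join keys between the two subsets.
print(shared_columns("script_extraction", "ipmaxmind"))
# prints ['ip', 'host', 'server', 'year']
```

If the assumptions hold, a call such as `load_subset(2018, "ipmaxmind")` would fetch the geolocation subset for 2018; check the dataset pages for the actual configuration names before relying on it.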
