Data Science
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from structured and unstructured data. Data scientists perform data analysis and preparation, and their findings inform high-level decisions in many organizations.
Here are 247 public repositories matching this topic...
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Updated Apr 8, 2026 - C++
Matplot++: A C++ Graphics Library for Data Visualization 📊🗾
Updated Apr 2, 2026 - C++
🔨 🍇 💻 🚀 GraphScope: A One-Stop Large-Scale Graph Computing System from Alibaba
Updated Apr 2, 2026 - C++
High-performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM), with Python and command-line interfaces.
Updated Aug 28, 2023 - C++
Shōgun
Updated Dec 19, 2023 - C++
C++ DataFrame for statistical, financial, and ML analysis in modern C++
Updated Apr 7, 2026 - C++
ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
Updated Apr 8, 2026 - C++
The Universal Storage Engine
Updated Mar 30, 2026 - C++
BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.
Updated Sep 16, 2022 - C++
A library created to revitalize C++ as a machine learning front end. Per aspera ad astra.
Updated Feb 25, 2022 - C++
Epsilla is a high performance Vector Database Management System
Updated Nov 29, 2025 - C++
Tree-Boosting, Gaussian Processes, and Mixed-Effects Models
Updated Apr 8, 2026 - C++
Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with the Python Database API Specification 2.0.
Updated Apr 6, 2026 - C++
oneAPI Data Analytics Library (oneDAL)
Updated Apr 8, 2026 - C++
Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also supports running data-cleaning scenarios with these algorithms. Desbordante has a console version and an easy-to-use web application.
Updated Apr 9, 2026 - C++
LabPlot is a FREE, open source and cross-platform Data Visualization and Analysis software accessible to everyone.
Updated Apr 9, 2026 - C++
A Lean Persistent Homology Library for Python
Updated Apr 7, 2026 - C++
A visualisation tool for the creation and analysis of graphs
Updated Dec 23, 2025 - C++
Fast, high-quality forecasts on relational and multivariate time-series data powered by new feature learning algorithms and automated ML.
Updated Oct 27, 2025 - C++
- Followers: 4.4k
- Website: github.com/topics/data-science
- Wikipedia