Code & Software
CORDS: COResets and Data Subset selection
GitHub link: https://github.com/decile-team/cords
Open-source toolkit implementing state-of-the-art algorithms (many of them from our group) for coreset and data subset selection.
Goal: Reduce end-to-end training time from days to hours, and from hours to minutes, using coresets and data selection.
Algorithms implemented: GLISTER, CRAIG, Grad-Match, Submodular Selection (Facility Location, Feature-Based Functions, Coverage, Diversity, etc.), and Random Selection.
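As an illustration of one of the submodular selection strategies listed above, the following is a minimal sketch of greedy facility location subset selection in plain Python. It is not the CORDS API: the function name `facility_location_greedy`, the Gaussian similarity kernel, and the toy points are all assumptions made for this example.

```python
import math

def facility_location_greedy(points, budget):
    """Greedily pick up to `budget` indices that maximize the facility
    location objective f(S) = sum_i max_{j in S} sim(i, j)."""
    n = len(points)
    # Assumed similarity: Gaussian kernel on Euclidean distance, values in (0, 1].
    sim = [[math.exp(-math.dist(p, q) ** 2) for q in points] for p in points]
    best = [0.0] * n  # best[i]: similarity of point i to its closest selected point
    selected = []
    for _ in range(min(budget, n)):
        candidates = [j for j in range(n) if j not in selected]

        # Marginal gain of adding j: total improvement in coverage over all points.
        def gain(j):
            return sum(max(sim[i][j] - best[i], 0.0) for i in range(n))

        pick = max(candidates, key=gain)
        selected.append(pick)
        best = [max(best[i], sim[i][pick]) for i in range(n)]
    return selected

# Hypothetical toy data: two well-separated clusters of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
subset = facility_location_greedy(points, budget=2)
print(subset)
```

With a budget of two, the greedy rule picks one representative from each cluster, which is the behavior that makes facility location useful for choosing a small training subset that still covers the data.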
DISTIL: Deep dIverSified inTeractIve Learning
GitHub link: https://github.com/decile-team/distil
DISTIL implements a number of state-of-the-art active learning algorithms.
Algorithms currently implemented in DISTIL include: Uncertainty Sampling, Margin Sampling, Least Confidence Sampling, FASS, BADGE, GLISTER-Active, CoreSetAL, Random Sampling, and Submodular Sampling.
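The uncertainty-based strategies named above can be sketched in a few lines. The snippet below is a plain-Python illustration, not DISTIL's API: the function names are hypothetical, the class probabilities are made up, and entropy is used here as one common form of uncertainty sampling, which is an assumption about what that term covers.

```python
import math

def least_confidence(probs):
    # Higher score when the top class probability is low.
    return 1.0 - max(probs)

def margin_score(probs):
    # Higher score when the two most likely classes are close together.
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)

def entropy_score(probs):
    # Shannon entropy of the predicted class distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_batch(pool_probs, budget, score=least_confidence):
    """Return indices of the `budget` unlabeled points with the highest score."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: score(pool_probs[i]),
                    reverse=True)
    return ranked[:budget]

# Hypothetical model outputs for three unlabeled points (three classes each).
pool = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # very uncertain prediction
    [0.60, 0.30, 0.10],  # moderately uncertain prediction
]
for score in (least_confidence, margin_score, entropy_score):
    print(score.__name__, select_batch(pool, budget=2, score=score))
```

In an active learning loop, the selected indices would be sent for labeling, added to the training set, and the model retrained before the next round of scoring.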
SMTK: A Submodular Optimization Toolkit in C++
Joint work with Jeff Bilmes, Kai Wei, Yuzong Liu and several others (currently maintained by Melodi Lab, University of Washington)
Provided the first general-purpose C++ toolkit for large-scale submodular function optimization, which includes a large class of algorithms and commonly used submodular functions.
Includes several memoization and implementation tricks to speed up the algorithms, such as implementations of Lazy Greedy and Lazier than Lazy Greedy.
Algorithms scale to massive datasets involving ground set sizes of several million instances.
Enables building summarization (document/image/video) and data selection applications in a few lines of code!
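To show the idea behind the Lazy Greedy speed-up mentioned above, here is a minimal sketch in Python (chosen for brevity; SMTK itself is a C++ toolkit and its interface is not shown here). The function `lazy_greedy`, the `gain` callback signature, and the toy coverage instance are assumptions for illustration. The trick relies on submodularity: a marginal gain computed in an earlier round is an upper bound on the current gain, so most candidates never need to be re-evaluated.

```python
import heapq

def lazy_greedy(ground_set, gain, budget):
    """Lazy greedy maximization of a monotone submodular function.

    `gain(item, selected)` must return the marginal gain of adding `item`
    to the list `selected`.
    """
    selected = []
    # Max-heap via negated gains. Each entry records the size of the selection
    # at the time its gain was computed, so stale bounds can be detected.
    heap = [(-gain(x, selected), idx, x, 0) for idx, x in enumerate(ground_set)]
    heapq.heapify(heap)
    while heap and len(selected) < budget:
        neg_bound, idx, x, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # The bound is up to date and at least as large as every other
            # (possibly stale) upper bound, so x is the true greedy choice.
            if -neg_bound <= 0:
                break  # no remaining item adds value
            selected.append(x)
        else:
            # Stale bound: recompute against the current selection and reinsert.
            heapq.heappush(heap, (-gain(x, selected), idx, x, len(selected)))
    return selected

# Hypothetical set-cover instance: each item covers some elements.
covers = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 2}}

def coverage_gain(item, selected):
    covered = set().union(*(covers[s] for s in selected)) if selected else set()
    return len(covers[item] - covered)

print(lazy_greedy(list(covers), coverage_gain, budget=2))
```

The heap entries carry an index as a tie-breaker so that items themselves are never compared, which keeps the sketch usable with unorderable item types.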
Jensen: An Easily-Extensible C++ Toolkit for Production-Level Machine Learning and Convex Optimization
GitHub link: https://github.com/decile-team/jensen
A modular framework for convex optimization, including several common convex functions and algorithms used in machine learning.
Implements several convex functions, such as Logistic Loss and Hinge Loss, and most convex optimization algorithms, including LBFGS, Trust Region Newton, LBFGS-Owl, Stochastic Gradient Descent, Nesterov’s optimal algorithm, Gradient Descent with various update rules, and Conjugate Gradient Descent.
Implements several basic Machine Learning classifiers such as L1/L2 regularized Logistic Regression, SVMs, Probit Regression etc.
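The pairing of a convex function with an optimization algorithm described above can be sketched as follows. This is a plain-Python illustration of L2-regularized logistic regression trained by fixed-step gradient descent, not Jensen's C++ interface; all names, the step size, and the toy data are assumptions.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logistic_loss_and_grad(w, X, y, lam):
    """Mean logistic loss plus (lam / 2) * ||w||^2, and its gradient.

    Labels in `y` are assumed to be 0 or 1.
    """
    n, d = len(X), len(w)
    loss = 0.5 * lam * sum(wj * wj for wj in w)
    grad = [lam * wj for wj in w]
    for xi, yi in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, xi))
        # log(1 + exp(z)) - y * z, written to avoid overflow for large |z|.
        loss += (max(z, 0.0) + math.log1p(math.exp(-abs(z))) - yi * z) / n
        residual = (sigmoid(z) - yi) / n
        for j in range(d):
            grad[j] += residual * xi[j]
    return loss, grad

def gradient_descent(X, y, lam=0.01, step=0.5, iters=200):
    """Fixed-step gradient descent starting from the zero vector."""
    w = [0.0] * len(X[0])
    for _ in range(iters):
        _, g = logistic_loss_and_grad(w, X, y, lam)
        w = [wj - step * gj for wj, gj in zip(w, g)]
    return w

# Hypothetical toy data: first column is a constant bias feature.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 1]
w = gradient_descent(X, y)
print(w, logistic_loss_and_grad(w, X, y, 0.01)[0])
```

Keeping the loss-and-gradient routine separate from the optimizer mirrors the modular split the framework describes: the same objective could be handed to a different solver without changing the function definition.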
Sanjaya: A Scalable C++ deep video analytics engine (See this link)
Implements a scalable real-time and post-mortem video analytics engine with functionality including object detection, face detection and recognition, human detection and attribute recognition, vehicle detection and attribute recognition, face age/gender recognition, and video summarization.
Integrates several open-source software packages, including OpenCV, Caffe, DarkNet, DLib and LibCCV, all in a single engine!
Supports training customized object detection and image classification models
Enables model finetuning and transfer learning
Supports live streams from surveillance cameras and several video file formats
Enables creating video analytics applications with a few lines of code!
Led to the development of two products: Surakshavyuh (real-time analytics and alerting) and Jigyasa (video search analytics). For more details on this, please visit this link.