Shubhendu Trivedi -- TTIC/University of Chicago

(Still avoiding mugshots!)

About me, research interests, background etc.

Note (30/11/2019): This is an old webpage and may not be updated; check out my new website.

I am currently a Research Associate at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), where I work with Professors Regina Barzilay and Tommi Jaakkola on causal learning and representation learning for graph-structured data, with a particular focus on applications to drug discovery. Prior to this, I was the NSF institute fellow at the Institute for Computational and Experimental Research in Mathematics at Brown University from October 2018 to June 2019, while also a research affiliate at MIT CSAIL. At Brown, I was a researcher-in-residence for their programs on Nonlinear Algebra and Computer Vision. At MIT, I was attached to the group of Prof. Regina Barzilay.

I completed my PhD in August 2018, with a thesis on similarity learning, metric estimation and group covariant neural networks. My dissertation committee consisted of Kevin Gimpel, Risi Kondor, Brian D. Nord and Gregory Shakhnarovich (here is a post-defense picture with the committee along with an honorary committee member!). During my PhD, I was very fortunate to work under the tutelage of Prof. Gregory Shakhnarovich at the Toyota Technological Institute at Chicago. I also worked very closely with Prof. Risi Kondor at the Departments of Statistics and Computer Science at The University of Chicago, as well as Prof. Brian D. Nord at the Kavli Institute for Cosmological Physics and Fermilab (Group: Deep Skies Lab). During the course of my PhD, I also had the unusual and enriching experience of getting to design, prepare and teach a large graduate course in deep learning (jointly with Prof. Kondor -- also read this Symmetry magazine article that mentions our class). My last industrial research internship during graduate school was at NEC Labs America, where I was mentored by Dr. Ryohei Fujimaki and worked on robust optimization.

I have broad interests in Machine Learning. In particular, I have a predilection for representation learning (deep and otherwise), structured prediction and general semi/weakly/self-supervised learning. Currently, I am exploring problems in the supervised learning of similarity and distance in low-shot regimes, as well as learning representations for combinatorial structures such as graphs and sets. Some of my recent efforts have been in the design and implementation of neural architectures that either have task-pertinent symmetries baked into them using the machinery of group and representation theory (aka group-equivariant neural networks), or attempt to learn them from data. Such networks provide a rational and attractive precept for the principled design of neural networks, while also affording significant data efficiency. I am also very interested in, and seek inspiration from, applications of machine learning in computer vision and, more recently, the physical sciences, especially computational chemistry and physics. I also maintain an amateur interest in extremal combinatorics and spectral graph theory from a past life.

Some background: Prior to PhD candidacy, I completed an MS (focusing on Machine Learning). Before that, in what now seems like a past life, I worked on problems in educational analytics, clustering and ensemble learning under the supervision of Professors Neil T. Heffernan and Gábor N. Sárközy, earning another MS (in Computer Science, here's the proof!) with a thesis (Prof. Sonia Chernova was the reader) that presented a new clustering algorithm based on the Szemerédi Regularity Lemma and also a method, somewhat similar to mixture of experts, that uses clustering for ensemble learning. Further afield, I worked in industry in the signal processing domain (Application Specific Integrated Circuits) for roughly one year after acquiring an undergraduate degree in Electronics and Communications Engineering. While working, I also helped my undergraduate advisor, Dr. (Mrs) K. R. Joshi, in teaching three senior year courses. During my undergrad, I worked on biometrics (face and speech recognition, using subspace projection methods for the former and dynamic programming for the latter). At the same time, I also worked on blind source separation with applications to magnetic resonance image denoising.

Contact:

email: shubhendu@{brown OR csail.mit OR cs.uchicago OR ttic}.edu

Research reports

The Expected Jacobian Outerproduct: Theory and Empirics
S. Trivedi and J. Wang.
Technical Report.
arXiv preprint, 2020

Asymmetric Multiresolution Matrix Factorization
Pramod Kaushik Mudrakarta, Shubhendu Trivedi and Risi Kondor.
Technical Report.
arXiv preprint, arXiv:1910.05132, 2019

Deep Learning for Automated Classification and Characterization of Amorphous Materials
Kirk Swanson, Shubhendu Trivedi, Joshua Lequieu, Kyle Swanson and Risi Kondor.
Soft Matter, The Royal Society of Chemistry, 2019
arXiv preprint arXiv:1909.04648

DeepCMB: Lensing Reconstruction of the Cosmic Microwave Background with Deep Neural Networks
Joao Caldeira, W. L. Kimmy Wu, Brian D. Nord, Camille Avestruz, Shubhendu Trivedi and Kyle Story.
Astronomy and Computing, https://doi.org/10.1016/j.ascom.2019.100307, 2019
arXiv preprint arXiv:1810.01483

Discriminative Learning of Similarity and Group-Equivariant Representations
Shubhendu Trivedi.
PhD Thesis. 2018
arXiv preprint arXiv:1808.10078

Clebsch-Gordan Networks: A Fully Fourier Space Spherical Convolutional Neural Network
Risi Kondor^†, Zhen Lin^† and Shubhendu Trivedi^†.
Neural Information Processing Systems (NIPS) 2018, Montreal, Canada.
arXiv preprint arXiv:1806.09231 (PDF)
[PyTorch Code]
^† denotes alphabetical author ordering

On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups
Risi Kondor and Shubhendu Trivedi.
International Conference on Machine Learning (ICML) 2018, Stockholm, Sweden
arXiv preprint arXiv:1802.03690 (PDF)

Predicting Molecular Properties with Covariant Compositional Networks
Hy Truong Son, Shubhendu Trivedi, Horace Pan, Brandon M. Anderson and Risi Kondor.
The Journal of Chemical Physics (JCP) 148, 241745, American Institute of Physics Publishing, 2018

Covariant Compositional Networks for Learning Graphs
Risi Kondor^†, Hy Truong Son^†, Horace Pan^†, Brandon M. Anderson^† and Shubhendu Trivedi^†.
International Conference on Learning Representations (ICLR) 2018 - WS Track, Vancouver, Canada
[PyTorch Code]
arXiv preprint arXiv:1801.02144 (PDF)
^† denotes arbitrary author ordering

Identification and measurement of galaxy cluster properties in millimeter wave maps using deep learning
W. L. Kimmy Wu, Brian D. Nord and Shubhendu Trivedi.
Submitted

Cross-Encoders: Learning Physics from Images
Joao Caldeira, W. L. Kimmy Wu, Camille Avestruz, Brian D. Nord, Shubhendu Trivedi and Kyle Story.
Submitted

The Jacobian Outerproduct
Shubhendu Trivedi^† and Jialei Wang^†.
^† denotes alphabetical author ordering

The Utility of Clustering in Prediction Tasks
Shubhendu Trivedi, Zachary A. Pardos, Neil T. Heffernan.
Technical Report
arXiv preprint arXiv:1509.06163

Discriminative Metric Learning by Neighborhood Gerrymandering
Shubhendu Trivedi, David McAllester, Gregory Shakhnarovich.
Neural Information Processing Systems (NIPS) 2014, Montreal, Canada.

A Consistent Estimator of the Expected Gradient Outerproduct
Shubhendu Trivedi^‡, Jialei Wang^‡, Samory Kpotufe, Gregory Shakhnarovich.
Uncertainty in Artificial Intelligence (UAI) 2014, Quebec City, Canada.
^‡ denotes equal contribution

Applying Clustering to the Problem of Predicting Retention within an ITS: Comparing Regularity Clustering with Traditional Methods.
Fei Song, Shubhendu Trivedi, Yu Tao Wang, Gábor N. Sárközy, Neil T. Heffernan.
AAAI FLAIRS 2013, St. Pete Beach, FL, United States. (older version)

A Graph-Theoretic Clustering Algorithm based on the Regularity Lemma and Strategies to Exploit Clustering for Prediction
Shubhendu Trivedi.
M. S. Thesis, 2012

A Practical Regularity Partitioning Algorithm and its Applications in Clustering
Gábor N. Sárközy^†, Fei Song^†, Endre Szemerédi^†, Shubhendu Trivedi^†.
arXiv preprint arXiv:1209.6540, 2012
^† denotes alphabetical author ordering

The real world significance of performance prediction
Zachary A. Pardos, Qing Yang Wang, Shubhendu Trivedi.
Educational Data Mining (EDM) 2012, Chania, Greece

Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
Shubhendu Trivedi, Zachary A. Pardos, Gábor N. Sárközy, Neil T. Heffernan.
Educational Data Mining (EDM) 2012, Chania, Greece

Clustered Knowledge Tracing
Zachary A. Pardos, Shubhendu Trivedi, Neil T. Heffernan, Gábor N. Sárközy.
Intelligent Tutoring Systems (ITS) 2012, Chania, Greece

Spectral Clustering in Educational Data Mining
Shubhendu Trivedi, Zachary A. Pardos, Gábor N. Sárközy, Neil T. Heffernan.
Educational Data Mining (EDM) 2011, Eindhoven, Netherlands

Clustering students to generate an ensemble to improve standard test score predictions
Shubhendu Trivedi, Zachary A. Pardos, Neil T. Heffernan.
Artificial Intelligence in Education (AIEd) 2011, Auckland, New Zealand

Notes/Unpublished Works/Theses

Slides : An introduction to Koopman Operators

Notes on Asymmetric Metric Learning for kNN Classification
Shubhendu Trivedi.
Notes, November 2015
Working document, PDF

A Graph-Theoretic Clustering Algorithm based on the Regularity Lemma and Strategies to Exploit Clustering for Prediction
Shubhendu Trivedi.
MS Thesis, 2012

Discriminative Learning of Similarity and Group Equivariant Representations
Shubhendu Trivedi.
Ph.D. Thesis, 2018

Patents

A Fully Fourier Space Spherical Convolutional Neural Network based on Clebsch-Gordan Transforms
R. Kondor, S. Trivedi and Z. Lin.
International Patent PCT/US2019/038236

Current collaborative projects and interests

Deep learning over point clouds and sets

Deep equivariant networks

Understanding the structure and dynamics of supercooled liquids and glasses using machine learning

Deep learning for detecting strong gravitational lensing

Low shot learning for combinatorial data

Teaching

I have taught undergraduate and graduate courses at various points and served as a teaching assistant for about a dozen CS/Math/EE courses. I have occasionally won awards for this work, the most recent being the best TA award in the CS department of The University of Chicago and a commendation from the Physical Sciences Division.

As Instructor/Co-Instructor:

Graduate Course (University of Chicago, CS):
-- Deep Learning (CMSC 35246, Textbook: Bengio, Goodfellow, Courville; Course website; Jointly taught with Prof. Risi Kondor)
Undergraduate Courses (University of Pune, EE):
-- Introduction to Digital Image Processing (Textbook: Gonzalez and Woods; Jointly taught with Prof. K. R. Joshi)
-- Image and Signal Processing Lab
-- Introduction to Bioinformatics (mostly covered the part on data mining)

As Teaching Assistant:

Graduate Courses:
-- CS 534 Artificial Intelligence (Instructor: Dr. Neil T. Heffernan, Textbook: Russell and Norvig)
-- TTIC 31020 Introduction to Statistical Machine Learning (Instructor: Dr. Gregory Shakhnarovich)
Undergraduate Courses:
-- CS 4120 Analysis of Algorithms (Instructor: Dr. Gábor N. Sárközy, Textbook: CLRS/Kleinberg-Tardos)
-- CS 2223 Introduction to Algorithms with Lua (Instructor: Dr. Joshua D. Guttman, Textbook: CLRS)
-- CS 3133 Foundations of Computer Science, i.e., Automata Theory (Instructor: Dr. Gábor N. Sárközy, Textbook: Sudkamp)
-- CS 4341 Introduction to Artificial Intelligence (Instructor: Dr. Neil T. Heffernan, Textbook: Russell and Norvig)
-- MA 2201 Discrete Mathematics (Instructor: Dr. Gábor N. Sárközy, Textbook: Kenneth Rosen)
-- CS 2223 Introduction to Algorithms with Lua (Instructor: Dr. Joshua D. Guttman, Textbook: CLRS)
-- CS 3133 Foundations of Computer Science, i.e., Automata Theory (Instructor: Dr. Gábor N. Sárközy, Textbook: Sudkamp, Dexter Kozen)
-- CS 2011 Introduction to Machine Organization and Assembly Language (Instructor: Dr. Hugh C. Lauer, Textbook: Bryant and O'Hallaron)
-- STAT 27725/CMSC 25400 Machine Learning (Instructor: Dr. Imre Risi Kondor)
(Slides from some lectures I gave in this course:
Discrete Probability Tutorial | Maximum Likelihood Estimation and Multivariate Gaussians
Artificial Neural Networks I | Artificial Neural Networks II)

Selected Courses

Graduate Courses:
Introduction to Statistical Machine Learning, Mathematical Foundations (type theory), Metric Geometry, Algorithms, Discrete Mathematics, Information Theory, Signals, systems and random processes, Speech technologies, Non-linear dynamical systems and chaos, Computability and complexity theory, Intelligent tutoring systems, Artificial Intelligence (with LISP), Automata Theory (Foundations of Computer Science), Numerical Linear Algebra, Combinatorics, Knowledge discovery and data mining, Logic in computer science etc.

Undergraduate Courses:
Very Large Scale Integration, Computer and voice networks, Optical and Microwave communication, Image processing, signal processing, Computer Architecture, Analog and Digital Communication, Advanced Microprocessors, Coding Theory, Power Electronics, Mechatronics, Electromagnetic fields, Network theory, Linear and non-linear control theory, Vector calculus, Real analysis, Abstract algebra, Ordinary differential equations, Elementary differential geometry etc.

Service etc.

Referee/Reviewing:
Neural Information Processing Systems (NeurIPS), User Modeling, Adaptation, and Personalization (UMAP), IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), IEEE Transactions on Information Theory, IEEE Transactions on Medical Imaging, International Conference on Machine Learning (ICML), International Conference on Learning Representations (ICLR), Computer Vision and Pattern Recognition (CVPR), International Conference on Computer Vision (ICCV) etc.

Miscellaneous

My Erdős Number is 2*. My Bacon Number is ∞. I don't eat Bacon.
*Paths (listing on the Erdős number project):
1. Shubhendu Trivedi (2011) ← Gábor N. Sárközy (1997) ← Paul Erdős (1932)
2. Shubhendu Trivedi (2012) ← Endre Szemerédi (1966) ← Paul Erdős (1932)

Elsewhere on the Internet:

-- Google Scholar
-- Onionesque Reality (a dormant blog, mostly on random things)
-- Goodreads (again, not too frequently updated, it is hard to catch up with my own reading speed ;)
-- Twitter (mostly ML related)