(Still avoiding mugshots!)
About me, research interests, background etc.
Note (30/11/2019): This is an old webpage and may not be updated; check out my new website.
I am currently a Research Associate at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) where I work with professors Regina Barzilay
and Tommi Jaakkola on causal learning and representation learning for graph-structured data, with a particular focus on applications to drug discovery. Prior to this, I was
the NSF institute fellow at the Institute for Computational and Experimental Research in Mathematics at Brown University from October 2018 to June 2019, while also a research affiliate with CSAIL at MIT. At Brown, I was
a researcher-in-residence for their programs on Nonlinear Algebra and
Computer Vision. At MIT, I was attached to the group of Prof. Regina Barzilay.
I completed my PhD in August 2018, with a thesis on similarity learning, metric estimation and group covariant neural networks. My dissertation committee comprised Kevin Gimpel, Risi Kondor,
Brian D. Nord and Gregory Shakhnarovich (Here is a
post-defense picture with the committee along with an honorary committee member!).
During my PhD, I was very fortunate to work under the tutelage of Prof. Gregory Shakhnarovich at the Toyota Technological Institute at Chicago.
I also worked very closely with Prof. Risi Kondor at the Departments of Statistics and Computer Science at The University of Chicago
as well as Prof. Brian D. Nord at the Kavli Institute for Cosmological Physics and Fermilab
(Group: Deep Skies Lab). During the course of my PhD, I also had the unusual and enriching experience of getting to design, prepare and teach a
large graduate course in deep learning (jointly with Prof. Kondor -- also read
this Symmetry magazine article that mentions our class).
My last industrial research internship during graduate school was at NEC Labs America, where I was mentored by Dr. Ryohei Fujimaki,
for work on robust optimization.
I have broad interests in Machine Learning. In particular, I have a predilection for (deep and otherwise) representation learning, structured prediction and general semi-/weakly-/self-supervised learning.
Recently, I have been exploring problems in the supervised learning of similarity and distance in low-shot regimes, as well as learning representations for combinatorial structures such as graphs and sets.
Some of my recent efforts have been in the design and implementation of neural architectures that either have task-pertinent symmetries baked into them
using the machinery of group and representation theory (i.e., group-equivariant neural networks), or attempt to learn them from data. Such networks provide a rational and attractive
precept for the principled design of neural networks, while also affording significant data efficiency. I am also very interested in, and seek inspiration from, applications of machine learning in computer
vision, and more recently, the physical sciences, especially
in computational chemistry and physics. I also maintain an amateur interest in extremal combinatorics and spectral graph theory from a past life.
Some background:
- Prior to PhD candidacy, I completed an MS (focusing on Machine Learning). Before that, in what now seems like a past life, I worked on problems in educational analytics, clustering and ensemble learning under the
supervision of Professors Neil T. Heffernan and Gábor N. Sárközy, earning another MS (in Computer Science,
here's the proof!) with a thesis (Prof. Sonia Chernova was the reader) that presented a new clustering algorithm based on the Szemerédi Regularity Lemma
and also a method somewhat similar to mixture of experts using clustering for ensemble learning. Further afield, I worked in industry in the signal processing domain (Application Specific Integrated Circuits) for roughly
one year after acquiring an undergraduate degree in Electronics and Communications Engineering. While working, I also helped my undergraduate advisor, Dr. (Mrs) K. R. Joshi, teach three senior year courses.
During my undergrad, I worked on biometrics (face and speech recognition - using subspace projection methods for the former and dynamic programming for the latter). At the same time, I also worked on blind source separation with applications to magnetic resonance image denoising.
Contact:
- email: shubhendu@{brown OR csail.mit OR cs.uchicago OR ttic}.edu
Research reports
The Expected Jacobian Outerproduct: Theory and Empirics
S. Trivedi and J. Wang.
Technical Report.
arXiv preprint, 2020
Asymmetric Multiresolution Matrix Factorization
Pramod Kaushik Mudrakarta, Shubhendu Trivedi and Risi Kondor.
Technical Report.
arXiv preprint, arXiv:1910.05132, 2019
Deep Learning for Automated Classification and Characterization of Amorphous Materials
Kirk Swanson, Shubhendu Trivedi, Joshua Lequieu, Kyle Swanson and Risi Kondor.
Soft Matter, The Royal Society of Chemistry, 2019
arXiv preprint arXiv:1909.04648
DeepCMB: Lensing Reconstruction of the Cosmic Microwave Background with Deep Neural Networks
Joao Caldeira, W. L. Kimmy Wu, Brian D. Nord, Camille Avestruz, Shubhendu Trivedi and Kyle Story.
Astronomy and Computing, https://doi.org/10.1016/j.ascom.2019.100307, 2019
arXiv preprint arXiv:1810.01483
Discriminative Learning of Similarity and Group-Equivariant Representations
Shubhendu Trivedi.
PhD Thesis. 2018
arXiv preprint arXiv:1808.10078
Clebsch-Gordan Networks: A Fully Fourier Space Spherical Convolutional Neural Network
Risi Kondor†, Zhen Lin† and Shubhendu Trivedi†.
Neural Information Processing Systems (NIPS) 2018, Montreal, Canada.
arXiv preprint arXiv:1806.09231 (PDF)
[PyTorch Code]
† denotes alphabetical author ordering
On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups
Risi Kondor and Shubhendu Trivedi.
International Conference on Machine Learning (ICML) 2018, Stockholm, Sweden
arXiv preprint arXiv:1802.03690 (PDF)
Predicting Molecular Properties with Covariant Compositional Networks
Hy Truong Son, Shubhendu Trivedi, Horace Pan, Brandon M. Anderson and Risi Kondor.
The Journal of Chemical Physics (JCP) 148, 241745, American Institute of Physics Publishing, 2018
Covariant Compositional Networks for Learning Graphs
Risi Kondor†, Hy Truong Son†, Horace Pan†, Brandon M. Anderson† and Shubhendu Trivedi†.
International Conference on Learning Representations (ICLR) 2018 - WS Track, Vancouver, Canada
[PyTorch Code]
arXiv preprint arXiv:1801.02144 (PDF)
† denotes arbitrary author ordering
Identification and measurement of galaxy cluster properties in millimeter wave maps using deep learning
W. L. Kimmy Wu, Brian D. Nord and Shubhendu Trivedi.
Submitted
Cross-Encoders: Learning Physics from Images
Joao Caldeira, W. L. Kimmy Wu, Camille Avestruz, Brian D. Nord, Shubhendu Trivedi and Kyle Story.
Submitted
The Jacobian Outerproduct
Shubhendu Trivedi† and Jialei Wang†.
† denotes alphabetical author ordering
The Utility of Clustering in Prediction Tasks
Shubhendu Trivedi, Zachary A. Pardos, Neil T. Heffernan.
Technical Report
arXiv version: arXiv 1509.06163
Discriminative Metric Learning by Neighborhood Gerrymandering
Shubhendu Trivedi, David McAllester, Gregory Shakhnarovich.
Neural Information Processing Systems (NIPS) 2014, Montreal, Canada.
A Consistent Estimator of the Expected Gradient Outerproduct
Shubhendu Trivedi‡, Jialei Wang‡, Samory Kpotufe, Gregory Shakhnarovich.
Uncertainty in Artificial Intelligence (UAI) 2014, Quebec City, Canada.
‡ denotes equal contribution
Applying Clustering to the Problem of Predicting Retention within an ITS: Comparing Regularity Clustering with Traditional Methods.
Fei Song, Shubhendu Trivedi, Yu Tao Wang, Gábor N. Sárközy, Neil T. Heffernan.
AAAI FLAIRS 2013, St. Pete Beach, FL, United States. (older version)
A Graph-Theoretic Clustering Algorithm based on the Regularity Lemma and Strategies to Exploit Clustering for Prediction
Shubhendu Trivedi.
M. S. Thesis, 2012
A Practical Regularity Partitioning Algorithm and its Applications in Clustering
Gábor N. Sárközy†, Fei Song†, Endre Szemerédi†, Shubhendu Trivedi†.
arXiv preprint arXiv:1209.6540, 2012
† denotes alphabetical author ordering
The real world significance of performance prediction
Zachary A. Pardos, Qing Yang Wang, Shubhendu Trivedi.
Educational Data Mining (EDM) 2012, Chania, Greece
Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
Shubhendu Trivedi, Zachary A. Pardos, Gábor N. Sárközy, Neil T. Heffernan.
Educational Data Mining (EDM) 2012, Chania, Greece
Clustered Knowledge Tracing
Zachary A. Pardos, Shubhendu Trivedi, Neil T. Heffernan, Gábor N. Sárközy.
Intelligent Tutoring Systems (ITS) 2012, Chania, Greece
Spectral Clustering in Educational Data Mining
Shubhendu Trivedi, Zachary A. Pardos, Gábor N. Sárközy, Neil T. Heffernan.
Educational Data Mining (EDM) 2011, Eindhoven, Netherlands
Clustering students to generate an ensemble to improve standard test score predictions
Shubhendu Trivedi, Zachary A. Pardos, Neil T. Heffernan.
Artificial Intelligence in Education (AIEd) 2011, Auckland, New Zealand
Notes/Unpublished Works/Theses
Slides: An introduction to Koopman Operators
Notes on Asymmetric Metric Learning for kNN Classification
Shubhendu Trivedi.
Notes, November 2015
Working document, PDF
A Graph-Theoretic Clustering Algorithm based on the Regularity Lemma and Strategies to Exploit Clustering for Prediction
Shubhendu Trivedi.
MS Thesis, 2012
Discriminative Learning of Similarity and Group Equivariant Representations
Shubhendu Trivedi.
Ph.D. Thesis, 2018
Patents
A Fully Fourier Space Spherical Convolutional Neural Network based on Clebsch-Gordan Transforms
R. Kondor, S. Trivedi and Z. Lin.
International Patent PCT/US2019/038236
Current collaborative projects and interests
Deep learning over point clouds and sets
Deep equivariant networks
Understanding the structure and dynamics of supercooled liquids and glasses using machine learning
Deep learning for detecting strong gravitational lensing
Low shot learning for combinatorial data
Teaching
I have taught undergraduate and graduate courses at various points and served as teaching assistant for about a dozen CS/Math/EE courses. I have occasionally won awards for this work, the most recent being
the best TA award in the CS department of The University of Chicago and a commendation from the physical sciences division.
As Instructor/Co-Instructor:
Graduate Course (University of Chicago, CS)
-- Deep Learning (CMSC 35246, Textbook: Bengio, Goodfellow, Courville; Course website; Jointly taught with Prof. Risi Kondor)
Undergraduate Courses (University of Pune, EE):
-- Introduction to Digital Image Processing (Textbook: Gonzalez and Woods; Jointly taught with Prof. K. R. Joshi)
-- Image and Signal Processing Lab
-- Introduction to Bioinformatics (mostly covered the part on data mining)
As Teaching Assistant:
Graduate Courses:
-- CS 534 Artificial Intelligence (Instructor: Dr. Neil T. Heffernan, Textbook: Russell and Norvig)
-- TTIC 31020 Introduction to Statistical Machine Learning (Instructor: Dr. Gregory Shakhnarovich)
Undergraduate Courses:
-- CS 4120 Analysis of Algorithms (Instructor: Dr. Gábor N. Sárközy, Textbook: CLRS/Kleinberg-Tardos)
-- CS 2223 Introduction to Algorithms with Lua (Instructor: Dr. Joshua D. Guttman, Textbook: CLRS)
-- CS 3133 Foundations of Computer Science, i.e. Automata Theory (Instructor: Dr. Gábor N. Sárközy, Textbook: Sudkamp)
-- CS 4341 Introduction to Artificial Intelligence (Instructor: Dr. Neil T. Heffernan, Textbook: Russell and Norvig)
-- MA 2201 Discrete Mathematics (Instructor: Dr. Gábor N. Sárközy, Textbook: Kenneth Rosen)
-- CS 2223 Introduction to Algorithms with Lua (Instructor: Dr. Joshua D. Guttman, Textbook: CLRS)
-- CS 3133 Foundations of Computer Science, i.e. Automata Theory (Instructor: Dr. Gábor N. Sárközy, Textbook: Sudkamp, Dexter Kozen)
-- CS 2011 Introduction to Machine Organization and Assembly Language (Instructor: Dr. Hugh C. Lauer, Textbook: Bryant and O'Hallaron)
-- STAT 27725/CMSC 25400 Machine Learning (Instructor: Dr. Imre Risi Kondor)
(Slides from some lectures I gave in this course:
Discrete Probability Tutorial |
Maximum Likelihood Estimation and Multivariate Gaussians |
Artificial Neural Networks I | Artificial Neural Networks II)
Selected Courses
Graduate Courses:
Introduction to Statistical Machine Learning, Mathematical Foundations (type theory), Metric Geometry, Algorithms, Discrete Mathematics, Information Theory,
Signals, systems and random processes, Speech technologies, Non-linear dynamical systems and chaos, Computability and complexity theory, Intelligent tutoring systems, Artificial Intelligence (with LISP), Automata Theory
(Foundations of Computer Science), Numerical Linear Algebra, Combinatorics, Knowledge discovery and data mining, Logic in computer science etc.
Undergraduate Courses:
Very Large Scale Integration, Computer and voice networks, Optical and Microwave communication, Image processing, signal processing, Computer Architecture,
Analog and Digital Communication, Advanced Microprocessors, Coding Theory, Power Electronics, Mechatronics, Electromagnetic fields, Network theory, Linear and non-linear control theory, Vector calculus, Real analysis, Abstract
algebra, Ordinary differential equations, Elementary differential geometry etc.
Service etc.
Referee/Reviewing:
Neural Information Processing Systems (NeurIPS), User Modeling, Adaptation, and Personalization (UMAP), IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI),
IEEE Transactions on Information Theory, IEEE Transactions on Medical Imaging, International Conference on Machine Learning (ICML), International Conference on Learning Representations (ICLR),
Computer Vision and Pattern Recognition (CVPR), International Conference on Computer Vision (ICCV) etc.
Miscellaneous
My Erdős Number is 2*. My Bacon Number is ∞. I don't eat Bacon.
*Paths (listing on the Erdős number project):
1. Shubhendu Trivedi (2011) ← Gábor N. Sárközy (1997) ← Paul Erdős (1932)
2. Shubhendu Trivedi (2012) ← Endre Szemerédi (1966) ← Paul Erdős (1932)
Elsewhere on the Internet:
-- Google Scholar
-- Onionesque Reality (a dormant blog, mostly on random things)
-- Goodreads (again, not too frequently updated, it is hard to catch up with my own reading speed ;)
-- Twitter (mostly ML related)