Shubhendu Trivedi

(Still avoiding mugshots!)

About me, research interests, background etc.

Note (30/11/2019): This is an old webpage and may not be updated; please check out my new website.

I am currently a Research Associate at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), where I work with Professors Regina Barzilay and Tommi Jaakkola on causal learning and representation learning for graph-structured data, with a particular focus on applications to drug discovery. Prior to this, I was the NSF institute fellow at the Institute for Computational and Experimental Research in Mathematics at Brown University from October 2018 to June 2019, while also being a research affiliate at MIT CSAIL. At Brown, I was a researcher-in-residence for their programs on Nonlinear Algebra and Computer Vision. At MIT, I was attached to the group of Prof. Regina Barzilay.

I completed my PhD in August 2018, with a thesis on similarity learning, metric estimation and group covariant neural networks. My dissertation committee consisted of Kevin Gimpel, Risi Kondor, Brian D. Nord and Gregory Shakhnarovich (Here is a post-defense picture with the committee along with an honorary committee member!). During my PhD, I was very fortunate to work under the tutelage of Prof. Gregory Shakhnarovich at the Toyota Technological Institute at Chicago. I also worked very closely with Prof. Risi Kondor in the Departments of Statistics and Computer Science at The University of Chicago, as well as Prof. Brian D. Nord at the Kavli Institute for Cosmological Physics and Fermilab (Group: Deep Skies Lab). During the course of my PhD, I also had the unusual and enriching experience of getting to design, prepare and teach a large graduate course in deep learning (jointly with Prof. Kondor -- also read this Symmetry magazine article that mentions our class). My last industrial research internship during graduate school was at NEC Labs America, where I was mentored by Dr. Ryohei Fujimaki, for work on robust optimization.

I have broad interests in Machine Learning. In particular, I have a predilection for representation learning (deep and otherwise), structured prediction and general semi/weakly/self-supervised learning. Currently, I am exploring problems in the supervised learning of similarity and distance in low-shot regimes, as well as learning representations for combinatorial structures such as graphs and sets. Some of my recent efforts have been in the design and implementation of neural architectures that either have task-pertinent symmetries baked into them using the machinery of group and representation theory (aka group-equivariant neural networks), or attempt to learn them from data. Such networks offer a rational and attractive precept for the principled design of neural networks, while also affording significant data efficiency. I am also very interested in, and seek inspiration from, applications of machine learning in computer vision and, more recently, the physical sciences, especially computational chemistry and physics. I also maintain an amateur interest in extremal combinatorics and spectral graph theory from a past life.

Some background:
Prior to PhD candidacy, I completed an MS (focusing on Machine Learning). Before that, in what now seems like a past life, I worked on problems in educational analytics, clustering and ensemble learning under the supervision of Professors Neil T. Heffernan and Gábor N. Sárközy, earning another MS (in Computer Science, here's the proof!). My thesis (Prof. Sonia Chernova was the reader) presented a new clustering algorithm based on the Szemerédi Regularity Lemma, and also a method somewhat similar to mixture of experts that uses clustering for ensemble learning. Further afield, I worked in industry in the signal processing domain (Application Specific Integrated Circuits) for roughly one year after acquiring an undergraduate degree in Electronics and Communications Engineering. While working, I also helped my undergraduate advisor, Dr. (Mrs) K. R. Joshi, in teaching three senior year courses. During my undergrad, I worked on biometrics (face and speech recognition, using subspace projection methods for the former and dynamic programming for the latter). At the same time, I also worked on blind source separation with applications to magnetic resonance image denoising.


Research reports

  • The Expected Jacobian Outerproduct: Theory and Empirics
    S. Trivedi and J. Wang.
    Technical Report.
    arXiv preprint, 2020

  • Asymmetric Multiresolution Matrix Factorization
    Pramod Kaushik Mudrakarta, Shubhendu Trivedi and Risi Kondor.
    Technical Report.
    arXiv preprint, arXiv:1910.05132, 2019

  • Deep Learning for Automated Classification and Characterization of Amorphous Materials
    Kirk Swanson, Shubhendu Trivedi, Joshua Lequieu, Kyle Swanson and Risi Kondor.
    Soft Matter, The Royal Society of Chemistry, 2019
    arXiv preprint arXiv:1909.04648

  • DeepCMB: Lensing Reconstruction of the Cosmic Microwave Background with Deep Neural Networks
    Joao Caldeira, W. L. Kimmy Wu, Brian D. Nord, Camille Avestruz, Shubhendu Trivedi and Kyle Story.
    Astronomy and Computing, 2019
    arXiv preprint arXiv:1810.01483

  • Discriminative Learning of Similarity and Group-Equivariant Representations
    Shubhendu Trivedi.
    PhD Thesis. 2018
    arXiv preprint arXiv:1808.10078

  • Clebsch-Gordan Networks: A Fully Fourier Space Spherical Convolutional Neural Network
    Risi Kondor, Zhen Lin and Shubhendu Trivedi.
    Neural Information Processing Systems (NIPS) 2018, Montreal, Canada.
    arXiv preprint arXiv:1806.09231 (PDF)
    [PyTorch Code]
    Authors are listed in alphabetical order.

  • On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups
    Risi Kondor and Shubhendu Trivedi.
    International Conference on Machine Learning (ICML) 2018, Stockholm, Sweden
    arXiv preprint arXiv:1802.03690 (PDF)

  • Predicting Molecular Properties with Covariant Compositional Networks
    Hy Truong Son, Shubhendu Trivedi, Horace Pan, Brandon M. Anderson and Risi Kondor.
    The Journal of Chemical Physics (JCP) 148, 241745, American Institute of Physics Publishing, 2018

  • Covariant Compositional Networks for Learning Graphs
    Risi Kondor, Hy Truong Son, Horace Pan, Brandon M. Anderson and Shubhendu Trivedi.
    International Conference on Learning Representations (ICLR) 2018 - WS Track, Vancouver, Canada
    [PyTorch Code]
    arXiv preprint arXiv:1801.02144 (PDF)
    denotes arbitrary author ordering

  • Identification and measurement of galaxy cluster properties in millimeter wave maps using deep learning
    W. L. Kimmy Wu, Brian D. Nord and Shubhendu Trivedi.

  • Cross-Encoders: Learning Physics from Images
    Joao Caldeira, W. L. Kimmy Wu, Camille Avestruz, Brian D. Nord, Shubhendu Trivedi and Kyle Story.

  • The Jacobian Outerproduct
    Shubhendu Trivedi and Jialei Wang.
    Authors are listed in alphabetical order.

  • The Utility of Clustering in Prediction Tasks
    Shubhendu Trivedi, Zachary A. Pardos, Neil T. Heffernan.
    Technical Report
    arXiv version: arXiv 1509.06163

  • Discriminative Metric Learning by Neighborhood Gerrymandering
    Shubhendu Trivedi, David McAllester, Gregory Shakhnarovich.
    Neural Information Processing Systems (NIPS) 2014, Montreal, Canada.

  • A Consistent Estimator of the Expected Gradient Outerproduct
    Shubhendu Trivedi, Jialei Wang, Samory Kpotufe, Gregory Shakhnarovich.
    Uncertainty in Artificial Intelligence (UAI) 2014, Quebec City, Canada.
    denotes equal contribution

  • Applying Clustering to the Problem of Predicting Retention within an ITS: Comparing Regularity Clustering with Traditional Methods.
    Fei Song, Shubhendu Trivedi, Yu Tao Wang, Gábor N. Sárközy, Neil T. Heffernan.
    AAAI FLAIRS 2013, St. Pete Beach, FL, United States. (older version)

  • A Graph-Theoretic Clustering Algorithm based on the Regularity Lemma and Strategies to Exploit Clustering for Prediction
    Shubhendu Trivedi.
    M. S. Thesis, 2012

  • A Practical Regularity Partitioning Algorithm and its Applications in Clustering
    Gábor N. Sárközy, Fei Song, Endre Szemerédi, Shubhendu Trivedi.
    arXiv preprint arXiv:1209.6540, 2012
    Authors are listed in alphabetical order.

  • The real world significance of performance prediction
    Zachary A. Pardos, Qing Yang Wang, Shubhendu Trivedi.
    Educational Data Mining (EDM) 2012, Chania, Greece

  • Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
    Shubhendu Trivedi, Zachary A. Pardos, Gábor N. Sárközy, Neil T. Heffernan.
    Educational Data Mining (EDM) 2012, Chania, Greece

  • Clustered Knowledge Tracing
    Zachary A. Pardos, Shubhendu Trivedi, Neil T. Heffernan, Gábor N. Sárközy.
    Intelligent Tutoring Systems (ITS) 2012, Chania, Greece

  • Spectral Clustering in Educational Data Mining
    Shubhendu Trivedi, Zachary A. Pardos, Gábor N. Sárközy, Neil T. Heffernan.
    Educational Data Mining (EDM) 2011, Eindhoven, Netherlands

  • Clustering students to generate an ensemble to improve standard test score predictions
    Shubhendu Trivedi, Zachary A. Pardos, Neil T. Heffernan.
    Artificial Intelligence in Education (AIEd) 2011, Auckland, New Zealand

    Notes/Unpublished Works/Theses

  • Slides: An introduction to Koopman Operators

  • Notes on Asymmetric Metric Learning for kNN Classification
    Shubhendu Trivedi.
    Notes, November 2015
    Working document, PDF

  • A Graph-Theoretic Clustering Algorithm based on the Regularity Lemma and Strategies to Exploit Clustering for Prediction
    Shubhendu Trivedi.
    MS Thesis, 2012

  • Discriminative Learning of Similarity and Group Equivariant Representations
    Shubhendu Trivedi.
    Ph.D. Thesis, 2018


  • A Fully Fourier Space Spherical Convolutional Neural Network based on Clebsch-Gordan Transforms
    R. Kondor, S. Trivedi and Z. Lin.
    International Patent PCT/US2019/038236

    Current collaborative projects and interests

  • Deep learning over point clouds and sets
  • Deep equivariant networks
  • Understanding the structure and dynamics of supercooled liquids and glasses using machine learning
  • Deep learning for detecting strong gravitational lensing
  • Low shot learning for combinatorial data


    I have taught undergraduate and graduate courses at various points and served as a teaching assistant for about a dozen CS/Math/EE courses. Once in a while I have won awards for this, the most recent being the best TA award in the CS department of The University of Chicago and a commendation from the Physical Sciences Division.

    As Instructor/Co-Instructor:

    Graduate Course (University of Chicago, CS):
    -- Deep Learning (CMSC 35246, Textbook: Goodfellow, Bengio, Courville; Course website; Jointly taught with Prof. Risi Kondor)
    Undergraduate Courses (University of Pune, EE):
    -- Introduction to Digital Image Processing (Textbook: Gonzalez and Woods; Jointly taught with Prof. K. R. Joshi)
    -- Image and Signal Processing Lab
    -- Introduction to Bioinformatics (mostly covered the part on data mining)

    As Teaching Assistant:

    Graduate Courses:
    -- CS 534 Artificial Intelligence (Instructor: Dr. Neil T. Heffernan, Textbook: Russell and Norvig)
    -- TTIC 31020 Introduction to Statistical Machine Learning (Instructor: Dr. Gregory Shakhnarovich)
    Undergraduate Courses:
    -- CS 4120 Analysis of Algorithms (Instructor: Dr. Gábor N. Sárközy, Textbook: CLRS/Kleinberg-Tardos)
    -- CS 2223 Introduction to Algorithms with Lua (Instructor: Dr. Joshua D. Guttman, Textbook: CLRS)
    -- CS 3133 Foundations of Computer Science, i.e. Automata Theory (Instructor: Dr. Gábor N. Sárközy, Textbook: Sudkamp)
    -- CS 4341 Introduction to Artificial Intelligence (Instructor: Dr. Neil T. Heffernan, Textbook: Russell and Norvig)
    -- MA 2201 Discrete Mathematics (Instructor: Dr. Gábor N. Sárközy, Textbook: Kenneth Rosen)
    -- CS 2223 Introduction to Algorithms with Lua (Instructor: Dr. Joshua D. Guttman, Textbook: CLRS)
    -- CS 3133 Foundations of Computer Science, i.e. Automata Theory (Instructor: Dr. Gábor N. Sárközy, Textbook: Sudkamp, Dexter Kozen)
    -- CS 2011 Introduction to Machine Organization and Assembly Language (Instructor: Dr. Hugh C. Lauer, Textbook: Bryant and O'Hallaron)
    -- STAT 27725/CMSC 25400 Machine Learning (Instructor: Dr. Imre Risi Kondor)
    (Slides from some lectures I gave in this course:
    Discrete Probability Tutorial | Maximum Likelihood Estimation and Multivariate Gaussians
    Artificial Neural Networks I | Artificial Neural Networks II)

    Selected Courses

    Graduate Courses:
    Introduction to Statistical Machine Learning, Mathematical Foundations (type theory), Metric Geometry, Algorithms, Discrete Mathematics, Information Theory, Signals, systems and random processes, Speech technologies, Non-linear dynamical systems and chaos, Computability and complexity theory, Intelligent tutoring systems, Artificial Intelligence (with LISP), Automata Theory (Foundations of Computer Science), Numerical Linear Algebra, Combinatorics, Knowledge discovery and data mining, Logic in computer science etc.

    Undergraduate Courses:
    Very Large Scale Integration, Computer and voice networks, Optical and Microwave communication, Image processing, signal processing, Computer Architecture, Analog and Digital Communication, Advanced Microprocessors, Coding Theory, Power Electronics, Mechatronics, Electromagnetic fields, Network theory, Linear and non-linear control theory, Vector calculus, Real analysis, Abstract algebra, Ordinary differential equations, Elementary differential geometry etc.

    Service etc.

    Neural Information Processing Systems (NeurIPS), User Modeling, Adaptation, and Personalization (UMAP), IEEE Transactions on Neural Networks and Learning Systems (IEEE TNNLS), IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), IEEE Transactions on Information Theory, IEEE Transactions on Medical Imaging, International Conference on Machine Learning (ICML), International Conference on Learning Representations (ICLR), Computer Vision and Pattern Recognition (CVPR), International Conference on Computer Vision (ICCV) etc.


    My Erdős Number is 2*. My Bacon Number is ∞. I don't eat Bacon.
    *Paths (listing on the Erdős number project):
    1. Shubhendu Trivedi (2011) ← Gábor N. Sárközy (1997) ← Paul Erdős (1932)
    2. Shubhendu Trivedi (2012) ← Endre Szemerédi (1966) ← Paul Erdős (1932)

    Elsewhere on the Internet:

    -- Google Scholar
    -- Onionesque Reality (a dormant blog, mostly on random things)
    -- Goodreads (again, not too frequently updated, it is hard to catch up with my own reading speed ;)
    -- Twitter (mostly ML related)