Beyond LLMs and Trillion-Parameter Models

These days, AI seems synonymous with GenAI (generative AI), LLMs (large language models), and ultra-large models. This focus has overshadowed other areas such as computer vision and voice AI. The success of trillion-parameter models is partly due to over-parameterization: many different parameter combinations can yield satisfactory results. Essentially, with enough hidden layers and GPU power, impressive outcomes become almost inevitable, even if the exact reasons why remain unclear.

I’ve noticed a similar trend with some of my non-neural-network methods: they perform best when granularity is maximized. Techniques based on combinatorial resampling excel after just a few million iterations, even though they explore only a tiny fraction of the sample space. Their performance is particularly impressive in high dimensions, where they exploit the sparsity of the feature space. Without that sparsity, the number of iterations required would make the methods infeasible.
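
To illustrate the idea, here is a minimal sketch of combinatorial resampling over feature subsets. It is my illustration under assumed settings (40 features, subsets of size 10, a placeholder scoring function), not the author's actual algorithm; the point is that a few million random draws visit only a vanishing fraction of the combinatorial space, yet typically land on a near-optimal combination.

```python
import math
import random

# Illustrative sketch, not the author's exact method: combinatorial
# resampling draws random feature subsets instead of enumerating all
# of them. With 40 features and subsets of size 10, the full space
# holds C(40, 10) ~ 8.5e8 combinations; one million random draws
# cover only a tiny fraction of it.

N_FEATURES = 40
SUBSET_SIZE = 10
N_ITERATIONS = 1_000_000

def score(subset):
    # Hypothetical placeholder objective; in practice this would be a
    # model-quality metric computed on the data restricted to `subset`.
    return sum(math.sin(f) for f in subset)

best_subset, best_score = None, -math.inf
for _ in range(N_ITERATIONS):
    subset = random.sample(range(N_FEATURES), SUBSET_SIZE)
    s = score(subset)
    if s > best_score:
        best_subset, best_score = subset, s

total = math.comb(N_FEATURES, SUBSET_SIZE)
print(f"explored at most {N_ITERATIONS / total:.2e} of {total:,} combinations")
print(f"best subset: {sorted(best_subset)}  score: {best_score:.3f}")
```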

Modern LLMs Are Not the Panacea

Despite their impressive performance, my experience with OpenAI’s GPT has been underwhelming. While it surpasses many alternatives, it fails to meet my needs, particularly for research questions. GPT doesn’t provide the references it uses, even when requested. My frustrations, shared by many, are detailed in my LinkedIn post, “My New LLM Project.”

As a result, I need to develop my own tool. OpenAI’s reluctance to share sources, possibly due to copyright concerns, and the verbosity of GPT’s answers are major drawbacks. I prefer concise bullet points and links over lengthy, beginner-friendly responses. My solution will be simpler, not requiring neural networks or polished English, and will focus on specialized sources in mathematics.

Research Topics Beyond LLMs

New developments outside of LLMs receive less funding and attention, but interest is growing in solutions that are less resource-intensive. My research focuses on faster techniques that deliver better results, including improved evaluation metrics that avoid the false negatives common when assessing synthetic tabular data. These methods produce replicable, explainable results and are less sensitive to the choice of seed. Detailed case studies are available on my blog.
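
To make the false-negative point concrete, here is a sketch of one way such a metric can work; this is my illustration, not the author's published formula. A column-by-column test can pass a synthetic table whose cross-feature dependencies are wrong; a distance based on the joint empirical CDF catches the mismatch.

```python
import numpy as np

# Hedged sketch (assumed, not the author's exact metric): a
# multivariate Kolmogorov-Smirnov-style distance based on the joint
# empirical CDF, evaluated at random anchor points drawn from the data.

def joint_ecdf(data, points):
    # Fraction of rows in `data` that are <= each anchor, componentwise.
    return np.array([(data <= p).all(axis=1).mean() for p in points])

def multivariate_ks(real, synth, n_anchors=2000, seed=None):
    rng = np.random.default_rng(seed)
    pool = np.vstack([real, synth])
    idx = rng.choice(len(pool), size=min(n_anchors, len(pool)), replace=False)
    anchors = pool[idx]
    return np.abs(joint_ecdf(real, anchors) - joint_ecdf(synth, anchors)).max()

# Demo: near-identical marginals, different dependence structure.
rng = np.random.default_rng(42)
x = rng.normal(size=5000)
real = np.column_stack([x, x + 0.1 * rng.normal(size=5000)])             # correlated
synth = np.column_stack([rng.normal(size=5000), rng.normal(size=5000)])  # independent
print(f"multivariate KS distance: {multivariate_ks(real, synth):.3f}")
```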

The State of Traditional Machine Learning

Traditional machine learning is not obsolete. Some of my research still involves resampling and regression, but with significant advancements. For example, my “cloud regression” technique operates without a designated dependent feature, performing supervised and unsupervised regression or clustering within a single framework. It handles various regression types, including regularization methods such as Lasso and ridge, and is model-free, even when building multivariate confidence bands. My loss functions are more generic than those used in modern LLMs, and my gradient descent method is original, math-free, and parameter-free.
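
The following sketch conveys the spirit of regression without a dependent feature; the circle model, the loss, and the derivative-free random search are my assumptions for illustration, not the author's actual algorithm. The curve is fit by minimizing orthogonal distances from the point cloud, so no variable plays a special role.

```python
import numpy as np

# Hedged sketch: fit a circle (cx, cy, r) to a 2D point cloud by
# minimizing the mean squared orthogonal distance to the curve, using
# a derivative-free random search as a stand-in for a math-free,
# parameter-free descent. Not the author's exact procedure.

def loss(params, cloud):
    cx, cy, r = params
    dist_to_center = np.hypot(cloud[:, 0] - cx, cloud[:, 1] - cy)
    return np.mean((dist_to_center - r) ** 2)  # orthogonal residuals

def fit_circle(cloud, n_iter=20000, seed=None):
    rng = np.random.default_rng(seed)
    params = np.array([cloud[:, 0].mean(), cloud[:, 1].mean(), cloud.std()])
    best = loss(params, cloud)
    step = 1.0
    for _ in range(n_iter):
        candidate = params + step * rng.normal(size=3)
        c_loss = loss(candidate, cloud)
        if c_loss < best:
            params, best = candidate, c_loss
        else:
            step *= 0.9995  # shrink the search radius when no progress
    return params, best

# Demo: noisy points on a circle; x and y are treated symmetrically.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 400)
cloud = np.column_stack([2 + 3 * np.cos(theta), -1 + 3 * np.sin(theta)])
cloud += 0.1 * rng.normal(size=cloud.shape)
(cx, cy, r), mse = fit_circle(cloud)
print(f"center=({cx:.2f}, {cy:.2f})  radius={r:.2f}  mse={mse:.4f}")
```

Because the loss measures orthogonal distance to the curve rather than vertical residuals, nothing distinguishes a dependent from an independent variable, which is what allows the same framework to cover both regression and clustering-style fits.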

New Types of Visualizations

Data visualization remains a traditional research area. While many focus on dashboards, I’ve developed powerful visualizations that summarize complex, high-dimensional data with 2D scatterplots. My latest innovation is a type of data animation applicable in various contexts, including GenAI.

To demonstrate, I applied the cloud regression procedure to 500 training sets shaped as ellipses, each with its own parameters and noise level. The goal was to evaluate the effectiveness of the curve-fitting technique across different data sets. A continuous path in the parameter space ensures smooth transitions between successive training sets in the animation, covering numerous parameter combinations. The blue curves represent estimates of the unknown theoretical ellipses.
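
Here is a sketch of the data-generation side of such an animation. The specific parameter path (semi-axes, rotation, and noise drifting smoothly with the frame index) is my assumption, not the author's actual setup; it shows how a continuous path yields frames that differ only slightly from one to the next.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Hedged sketch (assumed setup): 500 training sets sampled from
# ellipses whose parameters follow a continuous path in parameter
# space, so consecutive animation frames transition smoothly.

N_FRAMES, N_POINTS = 500, 200
rng = np.random.default_rng(1)

def ellipse_cloud(t, rng):
    # All parameters drift smoothly with the frame index t in [0, 1].
    a = 2 + np.sin(2 * np.pi * t)        # semi-major axis
    b = 1 + 0.5 * np.cos(4 * np.pi * t)  # semi-minor axis
    phi = np.pi * t                      # rotation angle
    noise = 0.05 + 0.15 * t              # noise grows along the path
    theta = rng.uniform(0, 2 * np.pi, N_POINTS)
    x, y = a * np.cos(theta), b * np.sin(theta)
    pts = np.column_stack([x * np.cos(phi) - y * np.sin(phi),
                           x * np.sin(phi) + y * np.cos(phi)])
    return pts + noise * rng.normal(size=pts.shape)

fig, ax = plt.subplots()
scat = ax.scatter([], [], s=8)
ax.set_xlim(-4, 4); ax.set_ylim(-4, 4); ax.set_aspect("equal")

def update(frame):
    scat.set_offsets(ellipse_cloud(frame / N_FRAMES, rng))
    ax.set_title(f"training set {frame + 1} / {N_FRAMES}")
    return (scat,)

anim = FuncAnimation(fig, update, frames=N_FRAMES, interval=40)
plt.show()
```

In the original animation, the blue estimated curves would be overlaid on each frame by running the cloud-regression fit on that frame's training set.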
