Math3ma

a math blog

Language, Statistics, & Category Theory, Part 2

July 7, 2021

•

Part 1 of this mini-series opened with the observation that language is an algebraic structure. But we also mentioned that thinking merely algebraically doesn't get us very far. The algebraic perspective, for instance, is not sufficient to describe the passage from probability distributions on corpora of text to syntactic and semantic information in language that we see in today's large language models. This motivated the category theoretical framework presented in a new paper I shared last time. But even before we bring statistics into the picture, there are some immediate advantages to using tools from category theory rather than algebra. One example comes from elementary considerations of logic, and that's where we'll pick up today.

Let's start with a brief recap.

Language, Statistics, & Category Theory, Part 1

June 29, 2021

•

Category Theory

In the previous post I mentioned a new preprint that John Terilla, Yiannis Vlassopoulos, and I recently posted on the arXiv. In it, we ask a question motivated by the recent successes of the world's best large language models:

What's a nice mathematical framework in which to explain the passage from probability distributions on text to syntactic and semantic information in language?

To understand the motivation behind this question, and to recall what a "large language model" is, I'll encourage you to read the opening article from last time. In the next few blog posts, I'll give a tour of mathematical ideas presented in the paper towards answering the question above. I like the narrative we give, so I'll follow it closely here on the blog. You might think of the next few posts as an informal tour through the formal ideas found in the paper.

Now, where shall we begin? What math are we talking about?

Let's start with a simple fact about language.

Language is algebraic.

By "algebraic," I mean the basic sense in which things combine to form a new thing. We learn about algebra at a young age: given two numbers $x$ and $y$ we can multiply them to get a new number $xy$. We can do something similar in language. Numbers combine to give new numbers, and words and phrases in a language combine to give new expressions. Take the words red and firetruck, for example. They can be "multiplied" together to get a new phrase: red firetruck.

Here, the "multiplication" is just concatenation — sticking things side by side. This is a simple algebraic structure, and it's inherent to language. I'm concatenating words together as I type this sentence. That's algebra! Another word for this kind of structure is compositionality, where things compose together to form something larger.
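To make the "multiplication" concrete, here is a minimal Python sketch. It rests on assumptions that are not in the text above: phrases are modeled as tuples of words, and `concat` and `EMPTY` are hypothetical names chosen for illustration. It checks two properties one would expect of concatenation, namely that grouping doesn't matter and that the empty phrase changes nothing.

```python
# A sketch only: phrases are modeled as tuples of words (an assumption).
# `concat` and `EMPTY` are hypothetical names, not taken from the post.

def concat(x, y):
    """'Multiply' two phrases by sticking them side by side."""
    return x + y

EMPTY = ()  # the empty phrase, which acts like a unit for concatenation

red = ("red",)
firetruck = ("firetruck",)
big = ("big",)

phrase = concat(red, firetruck)
print(" ".join(phrase))

# Grouping does not matter: (big red) firetruck == big (red firetruck).
assert concat(concat(big, red), firetruck) == concat(big, concat(red, firetruck))

# Concatenating with the empty phrase changes nothing.
assert concat(EMPTY, phrase) == phrase == concat(phrase, EMPTY)
```

Tuples are used rather than strings so that word boundaries are kept without having to manage spaces by hand.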

So language is algebraic or compositional.

A Nod to Non-Traditional Applied Math

June 24, 2021

•

Other

What is applied mathematics? The phrase might bring to mind historical applications of analysis to physical problems, or something similar. I think that's often what folks mean when they say "applied mathematics." And yet there's a much broader sense in which mathematics is applied, especially nowadays. I like what mathematician Tom Leinster once had to say about this (emphasis mine):

"I hope mathematicians and other scientists hurry up and realize that there’s a glittering array of applications of mathematics in which non-traditional areas of mathematics are applied to non-traditional problems. It does no one any favours to keep using the term 'applied mathematics' in its current overly narrow sense."

I'm all in favor of rebranding the term "applied mathematics" to encompass this wider notion. I certainly enjoy applying non-traditional areas of mathematics to non-traditional problems — it's such a vibrant place to be! It's especially fun to take ideas that mathematicians already know lots about, then repurpose those ideas for potential applications in other domains. In fact, I plan to spend some time sharing one such example with you here on the blog.

But before sharing the math, which I'll do in the next couple of blog posts, I want to first motivate the story by telling you about an idea from the field of artificial intelligence (AI).

Linear Algebra for Machine Learning

June 17, 2021

•

Algebra

The TensorFlow channel on YouTube recently uploaded a video I made on some elementary ideas from linear algebra and how they're used in machine learning (ML). It's a very nontechnical introduction — more of a bird's-eye view of some basic concepts and standard applications — with the simple goal of whetting the viewer's appetite to learn more.

I've decided to share it here, too, in case it may be of interest to anyone!

I imagine the content here might be helpful for undergraduate students who are getting their first exposure to linear algebra and/or to ML, or for anyone else who's new to the topic and wants to get an idea of what it is and some ways it's used.

The video covers three basic concepts (vectors, matrix factorizations, and eigenvectors/eigenvalues) and explains a few ways these concepts arise in ML: namely, as data representations, to find vector embeddings, and for dimensionality reduction techniques, respectively.

Enjoy!

Warming Up to Enriched Category Theory, Part 2

June 10, 2021

•

Category Theory

Let's jump right in to where we left off in part 1 of our warm-up to enriched category theory. If you'll recall from last time, we saw that the set of truth values $\{0, 1\}$, the unit interval $[0,1]$, and the nonnegative extended reals $[0,\infty]$ were not just sets but actually preorders and hence categories. We also hinted at the idea that a "category enriched over" one of these preorders (whatever that means; we hadn't defined it yet!) looks something like a collection of objects $X,Y,\ldots$ where there is at most one arrow between any pair $X$ and $Y$, and where that arrow can further be "decorated with," or simply replaced by, a number from one of those three exemplary preorders.
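As a rough illustration of "preorders and hence categories," the Python sketch below treats numbers as objects and says there is an arrow from $x$ to $y$ exactly when $x \leq y$. The direction of the arrows is an assumption made here for simplicity (the opposite convention is also possible), and `has_arrow` and `is_preorder` are hypothetical names. Reflexivity plays the role of identity arrows, transitivity plays the role of composition, and since `has_arrow` only returns true or false, there is at most one arrow between any pair.

```python
import math
from itertools import product

def has_arrow(x, y):
    """Assumed convention: there is an arrow x -> y exactly when x <= y."""
    return x <= y

def is_preorder(elements, arrow):
    """Check reflexivity and transitivity on a finite sample of elements."""
    # Reflexivity: every object has an arrow to itself (an identity).
    reflexive = all(arrow(x, x) for x in elements)
    # Transitivity: arrows x -> y and y -> z give an arrow x -> z (composition).
    transitive = all(
        arrow(x, z)
        for x, y, z in product(elements, repeat=3)
        if arrow(x, y) and arrow(y, z)
    )
    return reflexive and transitive

# Finite samples drawn from the three preorders mentioned above.
truth_values = [0, 1]
unit_interval_sample = [0.0, 0.25, 0.5, 1.0]
extended_reals_sample = [0.0, 1.5, math.inf]

for sample in (truth_values, unit_interval_sample, extended_reals_sample):
    assert is_preorder(sample, has_arrow)
```

The check only runs over finite samples, so it illustrates the preorder axioms rather than proving them for the full interval.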

With that background in mind, my goal in today's article is to say exactly what a category enriched over a preorder is. The formal definition — and the intuition behind it — will then pave the way for the notion of a category enriched over an arbitrary (and sufficiently nice) category, not just a preorder.

En route to this goal, it will help to make a couple of opening remarks.

Two things to think about.

First, take a closer look at the picture on the right. I've written "$\text{hom}(X,Y)$" in quotation marks because the notation $\text{hom}(-,-)$ is often used for a set of morphisms in ordinary category theory. But the point of this discussion is that we're not just interested in sets! So we should use better notation: let's refer to the number associated to a pair of objects $X$ and $Y$ as $\mathcal{C}(X,Y)$, where the letter "$\mathcal{C}$" reminds us there's an (enriched) $\mathcal{C}$ategory being investigated.

Second, for the theory to work out nicely, it turns out that preorders need a little more added to them.