So: last time, we talked about reductions in computer science. We certainly didn't exhaust the topic, but we at least got a definition on the board.


Now, there was one misleading thing that I didn't quite come out and say, but pretty strongly implied: I made it sound like the reason we would find a reduction of problem \(P\) to problem \(Q\) is that we have (or pretend we have, for the purposes of proving something) a solution to problem \(Q\), and use that solution to give a solution to problem \(P\).

That is certainly *one way* to use a reduction. But, and this may come as a surprise, it's *far more common* to use reductions in another way: to show that problem \(Q\) is *hard*, because we can reduce problem \(P\) to problem \(Q\) and we somehow already know that \(P\) is hard to solve.

If this sounds like nonsense, I have a different way to describe this idea. The original explanation I gave can be thought of as saying

if there's a reduction from problem \(P\) to problem \(Q\), then the difficulty of problem \(Q\) is an *upper bound* for the difficulty of problem \(P\) (whatever we precisely mean by "difficulty"). This new way of thinking about it then becomes

if there's a reduction from problem \(P\) to problem \(Q\), then the difficulty of problem \(P\) is a *lower bound* for the difficulty of problem \(Q\).

So they really aren't different ideas.

But one of the really annoying things about theoretical computer science is that many problems lend themselves to having lower bounds proved in a very pretty, abstract way, while the way to show an upper bound is, basically, to write down an actual algorithm that achieves some upper bound on difficulty (again, whatever that word actually means).

In the current research project I'm involved in, for example, "easy" means "capable of being computed by an abstract computer". The question we're looking at has to do with when a universal-algebraic variety has an "easy" formal theory; however, if you look at the theorems we actually prove, they're almost all of the form "if such-and-such a structural property holds in an algebra, then the variety it generates has a formal theory which is **as hard as it logically could be**".

Believe me, I'd love to have a theorem which shows that an interesting class has a decidable formal theory. But those theorems aren't cheap. (As I mentioned, I spent most of last week's conference huddled with a senior colleague who thinks he has a road map for just such a proof... but right now it's just a treasure map with a lot of blank swaths.)

All of which is to say that the first example of a reduction that I wanted to share is one of this type: not one that succeeds in making a problem easy (by reducing it to a solved problem) but one that establishes that a problem is hard.

That's not the goal of this post, however. There's a lot of plumbing that needs to be done first -- not terribly exciting stuff, but without it you're not going to have happy guests. Specifically, I want to talk about the nuts and bolts of a formal theory, and how one would go about building a function which takes formulas from one theory and spits out formulas in a different theory written in a different language.

## A Crash Course on First-Order Languages

The strength of a formal language is, paradoxically, its weakness: there are some things which it's possible to express in that language, and there are often a lot of things that are *impossible* to express. The strength comes from the fact that (if everything is well designed) everything which it *is* possible to express is either true or false (at least, once all the statements are interpreted as referring to mathematical objects).

What do I mean by that? Let's take a very simple example: the statement \[ x < y \] Now, as it stands, that statement is neither true nor false, but that's because neither \(x\) nor \(y\) stands for an object.

There's one more obstacle, though: not only are \(x\) and \(y\) currently "open" variables, but *so is the symbol "\(<\)"*. If I told you that you should interpret \(x\) as 7 and \(y\) as 17, then it'd be pretty safe to say the statement would be true. But what if I told you that \(x\) is the string "aardvark" while \(y\) is the string "blabber"? What should the meaning of "\(<\)" be? (I know of at least two reasonable interpretations: dictionary order, where "aardvark" does precede "blabber", and *prefix order*, under which *neither* string precedes the other -- they're just incomparable.)

If we're going to strip everything down to the very basics, then, we start with symbols intended to convey basic assertions in our formal language. Traditionally, "=" is required to be one of these and all others are optional. Let's agree that our first formal language will have just one more assertion symbol, \(R(x,y)\). What does \(R\) mean?

That's the thing: we have complete freedom over what \(R\) means. Here are some possibilities:

- \(x\) and \(y\) are users of a network, and \(R(x,y)\) asserts that \(x\) can send data to \(y\);
- \(x\) and \(y\) are strings, and \(R(x,y)\) asserts that \(x\) is a prefix of \(y\);
- \(x\) and \(y\) are real numbers, and \(R(x,y)\) asserts that the complex number \(x + iy\) is a zero of the Riemann Zeta Function.
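This freedom is the same one we saw with "\(<\)": the symbol carries no meaning until we fix an underlying set and an interpretation. Here is a quick sketch in Python (the function names are my own), using strings as the underlying set:

```python
# The same symbol R, interpreted two different ways over strings.
def R_prefix(x: str, y: str) -> bool:
    """R(x, y): x is a prefix of y."""
    return y.startswith(x)

def R_dict(x: str, y: str) -> bool:
    """R(x, y): x precedes y in dictionary (lexicographic) order."""
    return x < y

# In dictionary order, "aardvark" precedes "blabber"...
print(R_dict("aardvark", "blabber"))    # True
# ...but under the prefix relation, neither is related to the other:
print(R_prefix("aardvark", "blabber"))  # False
print(R_prefix("blabber", "aardvark"))  # False
```

Same symbol, same pairs of objects, different truth values: the interpretation is doing all the work.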

Such a definition (an underlying set, together with a specification of what our symbol \(R(x,y)\) means) is called a (first-order) *structure*. I like to write a structure like this: \( \mathbf{S} = \langle \text{underlying set}; \text{relation symbols} \rangle \).

And once we can assert a "simple statement", that is, \(x = y\) or \(R(x,y)\), we can assert a few kinds of *compound* statement: if \( \varphi \) and \( \psi \) are statements, then

- NOT \( \varphi \) is true iff \( \varphi \) is false, and vice versa;
- \( \varphi \) AND \( \psi \) is true iff both \( \varphi \) and \( \psi \) are true individually;
- \( \varphi \) OR \( \psi \) is true iff at least one of \( \varphi \) and \( \psi \) is true;

and, if \(x\) is a variable,

- FOR ALL \(x\), \(\varphi\) is true iff \( \varphi \) holds no matter which element of the underlying set \(x\) names, and
- THERE EXISTS \(x\) SUCH THAT \( \varphi \) is true iff \( \varphi \) holds for at least one choice of \(x\).

That's all the rules: every statement in our formal language gets built from simple statements and these five connectives. (For example, if we wanted to assert that some formula \( \varphi \) implies some other formula \( \psi \), we would write \( \psi \) OR (NOT \( \varphi \)), which is traditionally abbreviated \( \varphi \rightarrow \psi \).)
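These rules can be transcribed almost verbatim into code. Below is a minimal sketch (the tuple encoding and the name `holds` are my own) that evaluates a formula over a finite structure \( \langle A; R \rangle \):

```python
def holds(phi, A, R, env):
    """Evaluate formula phi in the structure <A; R> under the variable
    assignment env (a dict mapping variable names to elements of A)."""
    op = phi[0]
    if op == "=":                       # simple statement: x = y
        return env[phi[1]] == env[phi[2]]
    if op == "R":                       # simple statement: R(x, y)
        return R(env[phi[1]], env[phi[2]])
    if op == "NOT":
        return not holds(phi[1], A, R, env)
    if op == "AND":
        return holds(phi[1], A, R, env) and holds(phi[2], A, R, env)
    if op == "OR":
        return holds(phi[1], A, R, env) or holds(phi[2], A, R, env)
    if op == "FORALL":                  # FOR ALL x, psi
        return all(holds(phi[2], A, R, {**env, phi[1]: a}) for a in A)
    if op == "EXISTS":                  # THERE EXISTS x SUCH THAT psi
        return any(holds(phi[2], A, R, {**env, phi[1]: a}) for a in A)
    raise ValueError(f"unknown connective {op!r}")

# Example: in the structure <{0, 1, 2}; <=>, every element has an upper bound:
# FOR ALL x, THERE EXISTS y SUCH THAT R(x, y).
phi = ("FORALL", "x", ("EXISTS", "y", ("R", "x", "y")))
print(holds(phi, {0, 1, 2}, lambda a, b: a <= b, {}))  # True
```

Note that quantifiers only make sense over a fixed underlying set, which is exactly the point made above: no structure, no truth value.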

## Definable Subsets

Once we have this formal language, what might we do with it? Here are two possible related problems:

- Fix a structure \( \mathbf{A} = \langle A; R \rangle\), and any subset \( Z \subset A^k \) for some \(k\). Is \[ Z = \{ z \in A^k \colon \varphi(z) \} \] for some formula \( \varphi(x_1, \ldots, x_k) \) in the formal language?
- Fix a *family* of structures \( \mathcal{V} = \{ \mathbf{A}_1, \mathbf{A}_2, \ldots \}\), and any formula \( \varphi(x_1, \ldots, x_k) \) in the formal language. What can we say about the sets \[ Z_i = \{ z \in A_i^k \colon \varphi(z) \} \] Do they have some kind of common structure? Some interesting geometry? (Of course, we would need to have some notion of geometry in the original structures in \( \mathcal{V} \), but that's pretty common already.)
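In a small finite structure, the set \(Z\) from the first problem can simply be computed by brute force. A toy illustration (the structure and formula here are my own):

```python
# Take A = {0, 1, 2, 3} with R(x, y) meaning x < y, and the formula
# phi(x) = THERE EXISTS y SUCH THAT R(x, y)   ("x is not maximal").
A = {0, 1, 2, 3}

def R(x, y):
    return x < y

def phi(x):
    # THERE EXISTS y SUCH THAT R(x, y), evaluated by brute force over A.
    return any(R(x, y) for y in A)

Z = {x for x in A if phi(x)}   # the subset defined by phi
print(sorted(Z))               # [1, 2, 3] minus the maximum... i.e. [0, 1, 2]
```

The interesting (and hard) cases, of course, are infinite structures, where no such brute-force check is available.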

A subset *defined* by a formula in the language is called, originally enough, a *definable subset*.

As an example of the first problem type: let's go back to a set we mentioned before: \[Z = \{ (x,y) \in \mathbb{R}^2 \colon \zeta(x + iy) = 0 \}\] the set of zeros of the Riemann Zeta Function. Is this a definable subset? Of course, this depends on what our basic language is, so let's say that the language allows us the following simple statements: "\(x + y = z\)", "\( x \cdot y = z\)", "\(x \leq y\)", and "\( e^x = y\)". It is a celebrated theorem in model theory that, while we can say quite a lot in this language, our set \(Z\) is *not* definable.

## Semantic Interpretation

If you're still with me here, I'm amazed and grateful, because the previous few paragraphs are really the stuff of up to a month of work in a logic course. Let's get some payoff for that work!

The setup: we have two classes, \( \mathcal{V} \) and \( \mathcal{W} \), each in a first-order language. (Note: probably *not the same* first-order language.) For simplicity, let's say that the language of \( \mathcal{V} \) has just the symbol \( R(x,y)\), like we were using above, while the language of \( \mathcal{W} \) has other symbols, which we won't specify.

Second, for technical reasons that I'll explain later, we need to assume that \( \mathcal{V} \) is *finitely axiomatizable* (that is, there are sentences \( \alpha_1, \ldots, \alpha_n \) in the language of \( \mathcal{V} \) such that a structure is in \( \mathcal{V} \) if and only if all of the \( \alpha_i \) are true in that structure).

And last, we assume that we have somehow found two special formulas \( \pi(x) \) and \( \rho(x,y) \) in the language of \( \mathcal{W} \), with the following property:

For each \( \mathbf{A} \in \mathcal{V} \), there is

- a \( \mathbf{B} \in \mathcal{W} \) and
- a one-to-one function \(j\) from \(A\) into \(B\), such that
- \( \pi(b) \) is true (in \( \mathbf{B} \)) iff \(b = j(a) \) for some \(a \in A\), and
- for any \(a_1, a_2 \in A \), \(R(a_1, a_2) \) is true (in \( \mathbf{A}\)) iff \( \rho(j(a_1), j(a_2))\) is true (in \( \mathbf{B}\)).

Remember what we said before about discovering the "structure" of definable subsets? Well, this is a special case of that question: if all the properties above hold, then these definable subsets in \( \mathcal{W} \) form *structural copies* of everything in \( \mathcal{V} \)!

And now we can close the loop that we started at the beginning of this post: we're about to build a nontrivial reduction, from the set of formulas *satisfiable in* \( \mathcal{V} \) to the set of formulas satisfiable in \( \mathcal{W} \). What this will show is that *if* we know that we can solve the satisfiability problem in \( \mathcal{W} \) -- that is, if there is an algorithm that tells us whether a formula \( \varphi(x) \) is compatible with \( \mathcal{W} \) -- then we can write an algorithm to solve that problem in \( \mathcal{V} \). Or, turning things inside out, if we somehow already know that the satisfiability problem in \( \mathcal{V} \) is *not* solvable by any algorithm, then we conclude that the same is true for \( \mathcal{W} \).

To do that, we need to define our reduction: that is, our function \(f\) from \( \mathcal{V} \)-formulas to \( \mathcal{W}\)-formulas. Most of the work will be done by a helper function \(g\), which we define by recursion, starting from simple statements:

- if \( \varphi \) is \( x = y \), then \(g(\varphi)\) is \( \pi(x) \) AND \(\pi(y)\) AND \(x = y\);
- if \( \varphi \) is \( R(x,y) \), then \( g(\varphi) \) is \( \pi(x) \) AND \( \pi(y) \) AND \( \rho(x,y) \);
- if \( \varphi \) is NOT \( \psi \), then \( g(\varphi) \) is NOT \(g(\psi)\);
- if \( \varphi \) is \( \psi_1 \) AND \( \psi_2 \) (resp. \( \psi_1 \) OR \( \psi_2 \)), then \( g(\varphi) \) is \( g(\psi_1) \) AND \( g(\psi_2) \) (resp. \( g(\psi_1) \) OR \( g(\psi_2) \));
- if \( \varphi \) is FOR ALL \(x\), \( \psi \) then \( g(\varphi) \) is FOR ALL \(x\), \( \pi(x) \rightarrow g(\psi) \);
- if \( \varphi \) is THERE EXISTS \(x\) SUCH THAT \( \psi \), then \( g(\varphi) \) is THERE EXISTS \(x\) SUCH THAT \(\pi(x) \) AND \( g(\psi) \).

Finally, we let \( f(\varphi) \) be the conjunction of \( g(\varphi) \) with the translated axioms \( g(\alpha_1), \ldots, g(\alpha_n) \) and with \( \pi(x_i) \) for each free variable \( x_i \) of \( \varphi \).
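The recursion defining \(g\) can be transcribed line for line into code. A sketch (the tuple encoding of formulas is my own; `pi` and `rho` are left as abstract symbols of the language of \( \mathcal{W} \)):

```python
def PI(x):
    """The special formula pi(x) of the language of W, kept symbolic."""
    return ("pi", x)

def RHO(x, y):
    """The special formula rho(x, y) of the language of W, kept symbolic."""
    return ("rho", x, y)

def g(phi):
    """Translate a V-formula (a nested tuple) into a W-formula,
    following the recursive clauses above."""
    op = phi[0]
    if op == "=":                                  # x = y
        x, y = phi[1], phi[2]
        return ("AND", PI(x), ("AND", PI(y), ("=", x, y)))
    if op == "R":                                  # R(x, y)
        x, y = phi[1], phi[2]
        return ("AND", PI(x), ("AND", PI(y), RHO(x, y)))
    if op == "NOT":
        return ("NOT", g(phi[1]))
    if op in ("AND", "OR"):
        return (op, g(phi[1]), g(phi[2]))
    if op == "FORALL":                             # FOR ALL x, psi
        x, psi = phi[1], phi[2]
        # pi(x) -> g(psi), written as g(psi) OR (NOT pi(x))
        return ("FORALL", x, ("OR", g(psi), ("NOT", PI(x))))
    if op == "EXISTS":                             # THERE EXISTS x, psi
        x, psi = phi[1], phi[2]
        return ("EXISTS", x, ("AND", PI(x), g(psi)))
    raise ValueError(f"unknown connective {op!r}")

print(g(("EXISTS", "x", ("R", "x", "x"))))
```

The quantifier clauses are the heart of it: they *relativize* every quantifier to the definable subset picked out by \( \pi \), so that the translated formula only "talks about" the copy of the \( \mathcal{V} \)-structure sitting inside the \( \mathcal{W} \)-structure.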

**Theorem**: \(f\) is a reduction from the set of \( \mathcal{V} \)-satisfiable formulas to the set of \(\mathcal{W} \)-satisfiable formulas. That is, if \( \varphi \) is satisfiable in \( \mathcal{V} \), then \( f(\varphi) \) is satisfiable in \( \mathcal{W} \), and if \( \varphi \) is unsatisfiable, then so is \( f(\varphi) \).

**Proof**: The direction "if \( \varphi \) is satisfiable in \( \mathcal{V} \), then \( f(\varphi) \) is satisfiable in \( \mathcal{W} \)" is proved by induction on the complexity of the formula \( \varphi \). The basic idea is to find a structure \( \mathbf{A} \in \mathcal{V} \) which is isomorphic to a definable substructure of some \( \mathbf{B} \in \mathcal{W} \).

In the other direction, we show the contrapositive: suppose that \( f(\varphi) \) is satisfiable in \( \mathcal{W} \). This means that for some structure \( \mathbf{B} \) and some \( b_1, \ldots, b_k \) corresponding to the free variables of \( f(\varphi) \), \( f(\varphi)(b_1, \ldots, b_k) \) is true in the structure \( \mathbf{B} \). By how we've defined \( f\), we know that \( \pi(b_i) \) is true for each \(i\).

Now, the formula \( \pi(x) \) defines some subset \(Z\) of \(\mathbf{B}\), and the formula \( \rho(x,y) \) defines some binary relation \(R\) on this subset. We think of \( \mathbf{A} = \langle Z; R \rangle \) as a structure in the language of \( \mathcal{V} \); but is this structure actually in \( \mathcal{V} \)? Because \(f(\varphi) \) asserts that \( g(\alpha_1) \), \( g(\alpha_2) \), etc. are all true, it follows that \( \alpha_1\), \( \alpha_2 \), etc. are true in \( \mathbf{A} \). Since these sentences axiomatize \( \mathcal{V} \), we answer yes: \( \mathbf{A} \in \mathcal{V}\)!

But now, since \( f(\varphi)(b_1, \ldots, b_k) \) is true in \( \mathbf{B} \), and \( g(\varphi) \) is one of the conjuncts in \( f(\varphi) \), we know that \( g(\varphi) \) is true in \( \mathbf{B} \). Now another induction on complexity shows that \( \varphi \) is true in \( \mathbf{A} \). QED.

(Exercise for the reader: if you haven't done a proof by induction on formulas before, complete the details of that.)
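If you'd rather check the key fact empirically before doing the exercise, here is an end-to-end sanity check on a pair of toy finite structures (everything here -- the encodings, the structures, and the choice of \( \pi \) and \( \rho \) -- is my own, not part of the proof above): truth of \( \varphi \) on the definable copy agrees with truth of \( g(\varphi) \) in the ambient structure.

```python
def holds(phi, dom, rels, env):
    """Evaluate a formula (nested tuple) over the domain dom, with the
    assertion symbols interpreted by the dict rels, under assignment env."""
    op = phi[0]
    if op in rels:                       # an atomic assertion symbol
        return rels[op](*(env[v] for v in phi[1:]))
    if op == "=":
        return env[phi[1]] == env[phi[2]]
    if op == "NOT":
        return not holds(phi[1], dom, rels, env)
    if op in ("AND", "OR"):
        left = holds(phi[1], dom, rels, env)
        right = holds(phi[2], dom, rels, env)
        return (left and right) if op == "AND" else (left or right)
    if op == "FORALL":
        return all(holds(phi[2], dom, rels, {**env, phi[1]: d}) for d in dom)
    if op == "EXISTS":
        return any(holds(phi[2], dom, rels, {**env, phi[1]: d}) for d in dom)
    raise ValueError(f"unknown connective {op!r}")

def g(phi):
    """The translation defined above, with pi and rho as atomic symbols."""
    op = phi[0]
    if op == "=":
        return ("AND", ("pi", phi[1]), ("AND", ("pi", phi[2]), phi))
    if op == "R":
        return ("AND", ("pi", phi[1]),
                       ("AND", ("pi", phi[2]), ("rho", phi[1], phi[2])))
    if op == "NOT":
        return ("NOT", g(phi[1]))
    if op in ("AND", "OR"):
        return (op, g(phi[1]), g(phi[2]))
    if op == "FORALL":
        return ("FORALL", phi[1], ("OR", g(phi[2]), ("NOT", ("pi", phi[1]))))
    return ("EXISTS", phi[1], ("AND", ("pi", phi[1]), g(phi[2])))

# The W-structure B = <{0..5}; ...> with pi(x) = "x is even", rho = "<";
# its pi-definable part, with rho, is the V-structure A = <{0, 2, 4}; <>.
B = set(range(6))
A = {b for b in B if b % 2 == 0}
rels_W = {"pi": lambda b: b % 2 == 0, "rho": lambda x, y: x < y}
rels_V = {"R": lambda x, y: x < y}

# phi: every element has something R-above it.  False in A (4 is maximal),
# and the translation agrees when evaluated in B.
phi = ("FORALL", "x", ("EXISTS", "y", ("R", "x", "y")))
assert holds(phi, A, rels_V, {}) == holds(g(phi), B, rels_W, {})
print("truth values agree")
```

This is exactly the induction in the proof, run on one example: the relativized quantifiers in \( g(\varphi) \) range only over the \( \pi \)-part of \( \mathbf{B} \), which is (an isomorphic copy of) \( \mathbf{A} \).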
