Making a signal that aliens could understand

The movie "Contact" has an interesting description of how the aliens make contact with Earth. They send pulses in groups to get our attention. The number of pulses in each group is prime, so we can see that the signal is unusual and worth devoting our attention to. Then they send back a plan for how to make a wormhole machine, encoded in a television signal along with the original television signal that they received. In the movie, the aliens have the advantage of having that television signal to tell them what representation to use. This got me thinking. How would I do it without that advantage?

We don't know in advance how different the aliens would be from us. We would want to make as few assumptions as possible. Using prime numbers to get attention would be a good start. It would be clearly distinguishable from natural phenomena. But after that, it gets tricky.
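As a rough illustration of the attention-getting part, here is a small sketch that writes out groups of pulses whose sizes are the first few primes. The symbols `#` for a pulse and `_` for a gap, and the function names, are made up for this example; a real signal would be timed pulses, not text.

```python
def primes(count):
    """Return the first `count` primes, found by trial division."""
    found = []
    candidate = 2
    while len(found) < count:
        # A candidate is prime if no smaller prime divides it.
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def attention_signal(count, pulse="#", gap="_"):
    """Build one pulse group per prime, with a gap between groups."""
    groups = [pulse * p for p in primes(count)]
    return gap.join(groups)


print(attention_signal(5))
```

With five groups, the group sizes are 2, 3, 5, 7, and 11.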

I would use a ternary encoding. There would be a 0, a 1, and a pause. This would allow us to have separate binary sequences (like in Morse code). Within each binary sequence, I would have an Iota program, with 1 as * and 0 as i. The pause would indicate the end of a program. The 1s act like parentheses in that they indicate groupings. Thus, if there was a transmission error (quite likely at that range), it could easily be detected because the groupings would likely be unbalanced. This would also make the signal appear to have more structure than random noise. Iota is one of the simplest languages. It has no extra operations. It doesn't even have the concept of numbers built in, yet it is sophisticated enough that numbers can be described. In this sense, it has minimal axioms and thus should be the easiest to describe without common ground.
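Here is a minimal sketch of that error check, assuming the usual Iota grammar in which an expression is either i or * followed by two expressions. With 1 as * and 0 as i, a receiver only has to count how many expressions are still owed. The function names and the use of a space for the pause are my own choices for this example.

```python
def is_well_formed(bits):
    """Return True if `bits` is exactly one complete Iota expression.

    Assumed grammar: expr := '0' | '1' expr expr
    """
    needed = 1  # number of expressions still to be read
    for bit in bits:
        if needed == 0:
            return False  # symbols left over after a complete expression
        if bit == "1":
            needed += 1  # an application fills one slot and opens two
        elif bit == "0":
            needed -= 1  # a bare i fills one slot
        else:
            return False  # not a 0 or a 1
    return needed == 0


def split_programs(signal, pause=" "):
    """Split a received signal into programs at each pause."""
    return [chunk for chunk in signal.split(pause) if chunk]


for program in split_programs("1010100 1010101 101010100"):
    print(program, is_well_formed(program))
```

A well-formed program always has exactly one more 0 than it has 1s, so flipping any single symbol is caught by this check. Two errors that cancel each other out can still slip through, which is one reason a stronger error-correcting layer would be worth adding later.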

To explain Iota to the aliens, many examples would be given. It would start with several examples of the K function. The K function has the property that Kxy→x. That is, it eats its second argument and returns its first. Then it would give examples of increasingly complex functions. Each one would be evaluated step by step, with the steps separated by small gaps. Longer gaps would separate the examples. This should give them enough information to work out the rules of the language and to run its programs themselves. We could end this section with more fun sorts of programs, like a program that returns a list of prime numbers, a quine, and an Iota interpreter. That gives us a starting point from which we can describe mathematical concepts.
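To make the K example concrete, here is a small sketch of an Iota evaluator. It assumes the standard definition of i as the function that applies its argument to S and then K, under which K is written *i*i*ii, or 1010100 with 1 as * and 0 as i. It leans on Python functions and Python's eager evaluation rather than showing the step-by-step reductions the transmission would contain, so treat it as an illustration and not as the interpreter that would be sent.

```python
# The two classic combinators, written as curried Python functions.
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x
# Iota's single primitive: apply the argument to S, then to K.
IOTA = lambda f: f(S)(K)


def evaluate(bits):
    """Evaluate an Iota program given as a string of 0s (i) and 1s (*)."""

    def parse(pos):
        if pos >= len(bits):
            raise ValueError("program ended early")
        if bits[pos] == "0":
            return IOTA, pos + 1
        if bits[pos] == "1":
            func, pos = parse(pos + 1)  # the function being applied
            arg, pos = parse(pos)       # its argument
            return func(arg), pos
        raise ValueError("unexpected symbol: " + bits[pos])

    value, end = parse(0)
    if end != len(bits):
        raise ValueError("symbols left over after the program")
    return value


k = evaluate("1010100")          # *i*i*ii
print(k("first")("second"))      # K x y -> x

s = evaluate("101010100")        # *i*i*i*ii
print(s(K)(K)("unchanged"))      # S K K behaves as the identity
```

The first call prints "first" and the second prints "unchanged", which matches the rule Kxy→x.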

Of course, Iota is not practical for describing bigger, more complicated programs, because an Iota program can be exponentially longer than the same expression encoded by more conventional means. Thus, it would be useful to describe a better encoding. After the Iota part, there would be a long pause. Then we could give an interpreter, written in Iota, for a more practical language, such as a minimal version of Scheme. Common functions and macros could be defined after the interpreter is given. After that, there would be some programs in the new language, followed by their step-by-step evaluations.

After that, we could have programs that attempt to improve the quality of the transmission. For example, instead of sending a raw bit stream, we could give reference implementations of compression (bzip2), error-correcting codes (Reed-Solomon codes), and hashes (SHA-512). Each one could be followed by a more practical implementation and examples. This way, the transmission could from then on be received more reliably, and the receiver would know whether the received signal was intact or not.
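A minimal sketch of the compression and integrity-check layers, using Python's standard `bz2` and `hashlib` modules, might look like the following. The frame layout (compressed data followed by its SHA-512 digest) and the names `pack` and `unpack` are assumptions made for this example. Reed-Solomon coding is left out because it is not in the standard library, so this sketch can only detect damage, not repair it.

```python
import bz2
import hashlib

DIGEST_SIZE = hashlib.sha512().digest_size  # 64 bytes for SHA-512


def pack(message: bytes) -> bytes:
    """Compress the message and append a SHA-512 digest of the result."""
    compressed = bz2.compress(message)
    return compressed + hashlib.sha512(compressed).digest()


def unpack(frame: bytes) -> bytes:
    """Check the digest, then decompress; raise ValueError if damaged."""
    compressed, digest = frame[:-DIGEST_SIZE], frame[-DIGEST_SIZE:]
    if hashlib.sha512(compressed).digest() != digest:
        raise ValueError("frame was damaged in transit")
    return bz2.decompress(compressed)


message = b"1010100 " * 100
frame = pack(message)
print(len(message), len(frame))   # repetitive data compresses well
print(unpack(frame) == message)
```

If even one bit of the frame changes along the way, the digest no longer matches and `unpack` reports the damage instead of returning corrupted data.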

All of these steps would provide a basis on which to communicate as well as a way to describe the protocols so communication can be efficient and reliable.