
Programming Languages and Compilers
03 - Types

This lecture is the first part of our top-down look at programming languages, in which we pick out four topics (types, names, operations, and objects) that appear in one form or another in every programming language. In the course of these considerations, you should learn that there are fundamental principles in programming languages that reappear again and again in different combinations and flavors. The engineer always builds what he knows, or he relies on proven concepts that are often based on a rich body of theoretical work.

In this lecture we deal with the aspect of types in programming languages. First of all we consider, quite essentialistically, what types are and what the simplest manifestations of the concept "type" look like. These include, for example, the integer and pointer types, but also record types (compound data types, which occur in many languages; in C they are called structs) and array types. Building on this, we will look at polymorphism, which is essential for our modern programming languages. With polymorphic types we can define a potentially infinite universe of type combinations that helps us structure our programs properly and protects us from ourselves and our bugs. At the end of this lecture, we'll look at how the types we constructed in the previous sections are combined into a type system.

The reason why this lecture on types is so exciting is that modern programming languages such as C++ or Rust have very sophisticated type systems, which to a certain extent form the backbone of these languages. Only with the help of these rigorous type systems, which the compiler enforces before runtime, is it possible to build large programs with many developers working at the same time. Types are a means of communication between an individual developer and the machine, and between the developers. In both cases, the programmer uses type annotations to communicate: "When I created this variable, I intended the following properties and invariants to hold."

When the developer communicates with the compiler, two aspects of types are in the foreground: the cross-cutting flow of information and making expectations explicit.

Figure 1: Cross-cutting flow of information on a tree

In the chapter on parsing we learned that we can extract the syntax of our programs context-free and represent it as a tree. The big disadvantage of trees, however, is that they are strictly hierarchical: if you only follow the edges of a tree, information can only flow between an element and its enclosed children. It is not possible for information to be passed from the very front to the very back without visiting the root. The slide example shows this: if we only followed the tree edges, we would not know at a point of use which addition operation we should select for a variable. In this situation, types come to the rescue: the static type annotation puts a context into the variable's backpack at the point of definition, which the variable carries around with it at every point of use. From this context established by the types, the compiler can then select the appropriate operation. So we have set up a link that cuts across the tree. We will see exactly what the rules for these links look like in the chapter on semantic analysis.
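The mechanism above can be sketched in a few lines of Python. This is a minimal sketch with made-up names: the symbol table plays the role of the "backpack" filled at the point of definition, and a later point of use consults it to select the right operation.

```python
# The declared type travels in the variable's "backpack" (here: a symbol
# table filled at the point of definition) and lets us pick the correct
# addition operation at every later point of use, without walking the tree.
symbol_table = {"x": "int", "y": "float"}   # established at definition

def select_add(var):
    # Point of use: consult the context the type annotation established.
    declared = symbol_table[var]
    return {"int": "ADD_INT", "float": "ADD_FLOAT"}[declared]

print(select_add("x"))  # → ADD_INT
print(select_add("y"))  # → ADD_FLOAT
```

The semantic-analysis chapter will make these lookup rules precise; here the point is only that the type annotation carries information across the tree.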

The second advantage of types is that they help us avoid bugs. The variable's backpack contains not only the operations attached to the variable, but also which objects are valid assignments to this variable. In the slide example, the compiler can detect that we cannot assign a Dog object to a Cat variable, because dogs and cats are completely different types. With the type annotation, the programmer gives the assurance that this variable only ever stores cats, at every point in time, over the entire runtime of the program.
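A small Python sketch of the same idea (the Cat/Dog classes and the adopt function are made up for illustration; in Python the annotation is enforced by an external static checker such as mypy, not by the interpreter at runtime):

```python
class Cat:
    def meow(self):
        return "meow"

class Dog:
    def bark(self):
        return "woof"

def adopt(pet: Cat) -> str:
    # The annotation promises: pet is always a Cat, so meow() exists.
    return pet.meow()

print(adopt(Cat()))   # fine
# adopt(Dog())        # a static checker such as mypy rejects this call
```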

The third benefit of types relates to communication with other people. Without type annotations, the reading programmer has to work out the type of the stored objects on their own. A program with types, which may even carry meaningful names as in the example, is much easier to understand.

In summary: do not see types and the type system of your programming language as an enemy that keeps you from expressing yourself, but as a friend and helper in need that keeps you from doing stupid things.

As a first step in our consideration of types, we ask ourselves the question: "What is a type, anyway?" What is the essence of types once we abstract away all the specifics of individual languages? Which definition covers all types in all programming languages? There is not just one definitive answer, but three different perspectives (structural, denotational, and abstract), all of which are useful and illuminate different aspects.

The structural interpretation of types is a very technical approach to the problem. It answers the question "What are types?" by saying: everything is a type that I can build according to the rules of type composition. That is similar to saying that the natural numbers are those mathematical objects that I can create by starting from zero and incrementing repeatedly. Or that everything is a bread that I can make by mixing flour with water and baking it (we will come back to the bread topic later, I promise!). In the case of types, this constructive argument leads to conclusions such as: "pointer to char" is a type, because char is a type and "pointer to X" is a composition rule for types.

In the structural interpretation, type expressions play a central role. They describe how we can create more complex types from simple types using type constructors. The language brings along a number of built-in basic types (built-in types) that are already defined before a single character of code has been read. Often this initial population of the set of all types (the set T) already contains integer types, character types, and a type for truth values.

Type constructors are then functions that take one or more elements of the type set as arguments and generate a new type expression, which in turn becomes part of the set T. The inclined reader may be surprised that we do not write down the return value of the type constructors, but the not-yet-evaluated expression. We do this because we are not interested in the concrete bit representation of types in a concrete compiler, but in how these types are formed. So we would write down our "pointer to char" from just now as POINTER(CHAR). To make this clear, let's take a quick look at a piece of Python that illustrates this structural interpretation:

# Built-in types
BOOL = 0
CHAR = 1
INT = 2

# Type constructors
def POINTER(pointee_type):
    return (100, pointee_type)

def PAIR(T1, T2):
    return (101, T1, T2)

def SET(element_type):
    return (102, element_type)

# Type expressions
print("char:", CHAR)
print("pointer to char:", POINTER(CHAR))
print("pair of int and set of char:", PAIR(INT, SET(CHAR)))

In this minimal type system we already see several things: the existence of built-in types, the presence of type constructors, and the difference between a type expression such as POINTER(CHAR) and the encoding of the type as a Python object. In principle, we can already create an infinite number of types with this type system by applying the constructors over and over again. This results in type expressions that are nested deeper and deeper.

The structural view also means that, due to their nesting, we can draw these type expressions as trees. They look structurally very similar to our ASTs from the last chapter; a fact that we will meet again in the chapter on semantic analysis.
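To make the tree shape tangible, here is a small sketch that pretty-prints a type expression as an indented tree. The constructor encodings are repeated from the earlier snippet so this one is self-contained; the NAMES table is an assumption made for display purposes.

```python
# Type expressions as nested tuples, rendered as an indented tree,
# much like an AST dump.
CHAR, INT = 1, 2
NAMES = {1: "char", 2: "int", 100: "pointer", 101: "pair"}

def POINTER(t):
    return (100, t)

def PAIR(t1, t2):
    return (101, t1, t2)

def format_type(t, indent=0):
    pad = " " * indent
    if isinstance(t, tuple):          # constructed type: node with children
        lines = [pad + NAMES[t[0]]]
        for child in t[1:]:
            lines.append(format_type(child, indent + 2))
        return "\n".join(lines)
    return pad + NAMES[t]             # built-in type: leaf

print(format_type(PAIR(INT, POINTER(CHAR))))
```

This prints a two-level tree with "pair" at the root and "pointer" as an inner node above the "char" leaf.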

The second lens on types is the denotational view. Here we define a type as a predicate over all possible (imaginary and real) objects in the universe. If the predicate says "yes", then the object is of this type; if it says "no", then the object lies outside the object set enclosed by the type. Through these glasses, types are much more flexible than in the purely structural view, because we only have to give some predicate in order to define a type. So the set of all even integers can form a type of its own:

def int_even(obj):
    if type(obj) is int and (obj % 2) == 0:
        return True
    return False

objects = [1, 2, 3, "foo", True, [1, 2, 3]]  # + list(range(4, 20))
# Zermelo-Fraenkel set comprehension: [obj | obj <- objects, int_even(obj)]
print([obj for obj in objects if int_even(obj)])

Here we use the predicate to filter a list of objects for those objects that are of the corresponding type. Notice how we used Python's primitives in our predicate to derive our new type from a base type.

If we write down the rules of a denotationally defined type in a more structured way than as a Python function, a compiler can also derive properties of the object set from them. When adding two even variables, the optimizer could know that the result will again be even.
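We can at least spot-check this closure property empirically. This is a sketch, not an optimizer: we enumerate a sample of even integers and verify that every pairwise sum satisfies the predicate again.

```python
# Denotational rule "even": closed under addition. A real optimizer would
# prove this symbolically; here we only spot-check it on a sample.
def int_even(obj):
    # bool is excluded explicitly, since bool is a subtype of int in Python
    return isinstance(obj, int) and not isinstance(obj, bool) and obj % 2 == 0

evens = [x for x in range(-20, 21) if int_even(x)]
print(all(int_even(a + b) for a in evens for b in evens))  # → True
```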

The third interpretation of types is the abstract point of view. A type is defined as the set of valid operations on objects of the type. This is where the idea of the type context that we took up earlier (the backpack) becomes particularly clear. The type is its interface, no matter how this interface is implemented. An example of this abstract definition of types are the interfaces in Java:

interface Object2D {
    float area();
}

class MyMain {
    public static void main(String[] args) {
        Object2D obj = ....;
        obj.area();
    }
}

Here we define the interface Object2D, which is determined by the method area(). Later on, we can create variables of this interface type that can hold any object implementing the interface. By using the interface type, we have the guarantee, via the variable, that there is an area() method that returns a floating-point number.
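The same abstract view can be expressed in Python with an abstract base class. This is a sketch of an analogue, not part of the Java example; the Circle class is a made-up implementation of the interface.

```python
# Python counterpart of the Java interface: the abstract base class fixes
# the set of valid operations, Circle is one possible implementation.
from abc import ABC, abstractmethod

class Object2D(ABC):
    @abstractmethod
    def area(self) -> float: ...

class Circle(Object2D):
    def __init__(self, radius: float):
        self.radius = radius

    def area(self) -> float:
        return 3.14159 * self.radius ** 2

obj: Object2D = Circle(1.0)   # any implementor fits into this variable
print(obj.area())
```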

With our three lenses we have only managed one thing so far: we created types, but we could not yet do anything with them. Pretty useless and lazy mathematical objects, if we just let them be created wildly. Therefore, in addition to the type constructors, we now want to define further operations that work on our type set. Operations as useful as possible.

Probably the most important question for a compiler is whether two given types are equivalent, that is, whether they mean the same thing. If two types are equivalent, their objects are indistinguishable from the point of view of the type system, completely compatible with one another, and best-friends-forever anyway. Or to put it a little more carefully: type equivalence is the strictest relation between two types. There are two different kinds of type equivalence that occur in programming languages: structural equivalence and name equivalence.

Under structural equivalence, two types are equivalent if they have the same type expression. If we use a type name in a program (as in Pascal), it is just an abbreviation for the type expression on its right-hand side; the name itself is ignored in the comparison. Furthermore, different notations are regarded as equivalent as long as they lead to the same type expression.
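With type expressions encoded as nested tuples, structural equivalence becomes plain structural comparison. A sketch (constructor encoding repeated so the snippet is self-contained; the two type names and the Pascal comments are made up):

```python
# Structural equivalence: names are expanded away, only the resulting
# type expressions are compared.
INT = 2

def POINTER(t):
    return (100, t)

IntPtr = POINTER(INT)    # e.g. Pascal: type IntPtr  = ^integer
NodeRef = POINTER(INT)   # e.g. Pascal: type NodeRef = ^integer

def structurally_equivalent(t1, t2):
    return t1 == t2      # compare the expanded type expressions

print(structurally_equivalent(IntPtr, NodeRef))  # → True
```

Note that the two names were chosen with clearly different intentions, yet the comparison cannot tell them apart; that is exactly the weakness discussed next.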

However, this very technical kind of equivalence has some serious disadvantages, which is why it is hardly used today. The first is that accidental, unintended equivalence can occur when two types happen to share the same structure. The two types on the slides, written in Pascal (which uses structural equivalence), are equivalent, although the programmer, by writing two separate type definitions, quite clearly meant different things.

The second problem of structural equivalence is recursive data types. Such recursive types occur when we can give a type a name and use that name in the definition of the type itself. The prime example of a recursive type, which we already discussed in the parsing chapter, is the linked list, in which each list element contains a pointer to the (rest of the) list. The problem arises when we naively replace each occurrence of the type name with its type expression, because then an infinitely deep nested type expression is created for this very simple list. An infinite structure would be fine for a mathematician, but on a finite computer you have to use a few tricks, borrowed from the equivalence of modes and the equivalence of finite automata, to compare such infinite type expressions.
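One standard trick can be sketched in a few lines: compare the (possibly cyclic) type graphs and, whenever a pair of nodes is revisited, optimistically assume that the pair is equivalent, just as in the equivalence check for finite automata. All names below are assumptions; real compilers use a more careful representation.

```python
# Sketch: equivalence of possibly cyclic type graphs. Built-in types are
# leaf name strings; constructed types are tuples/lists whose first element
# is a tag. Cycles are handled by assuming revisited pairs equal.
def equivalent(t1, t2, assumed=None):
    if assumed is None:
        assumed = set()
    if isinstance(t1, str) or isinstance(t2, str):
        return t1 == t2                # leaves: compare type names
    pair = (id(t1), id(t2))
    if pair in assumed:                # met this pair before: assume equal
        return True
    if t1[0] != t2[0] or len(t1) != len(t2):
        return False
    assumed.add(pair)
    return all(equivalent(a, b, assumed) for a, b in zip(t1[1:], t2[1:]))

# struct list { int payload; struct list *next; }  -- built twice
def make_list_type():
    node = ["record", "int"]           # a Python list allows the cycle
    node.append(("ptr", node))
    return node

print(equivalent(make_list_type(), make_list_type()))  # → True
```

Without the `assumed` set, the comparison of the two list types would recurse forever; with it, the check terminates because only finitely many node pairs exist.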

Because of these problems with structural equivalence, most modern languages use name equivalence. The idea is that every definition of a type, no matter how the type is structured, is a separate type that is only equivalent to itself. If the programmer takes the trouble to choose a name, then we should at least acknowledge that this type is special!

Whenever a type name is used in a type expression, we only store the name there, not the assigned type expression. As a result, our type expressions, even for recursive types, become finite, properly hierarchical trees again.

Together with name equivalence, the question immediately arises whether we allow aliases of types: a type alias is a second type name that references the same type (not just the same type expression) and is therefore fully equivalent to it. In C, which uses name equivalence for records, you create such an alias with typedef. In C, typedef is in fact the only way to introduce a new type name at all (see the lexer hack).

It is important to understand that all typedef aliases of a type are equivalent to that same type; the type system makes no distinction between them. This means that, as on the slides, two typedef names for the same underlying type denote one and the same type. Therefore, the compiler cannot help us detect a misuse in which we mix the two up. A similar mistake has already cost mankind a Mars Climate Orbiter.

One way to solve this in C is to use records with only one field. Since name equivalence applies to records, you can create non-equivalent wrapper types this way, for which the compiler checks that we do not accidentally mix them up. For example, the Linux kernel uses such a type definition to prevent accidentally incrementing an atomic data type directly and to ensure that the special atomic API is used instead.

typedef struct {
    volatile int counter;
} atomic_t;

Other languages such as Ada not only allow you to create type aliases (with subtype), but also let you create types that are not equivalent yet have the same type expression (derived types, declared with "type ... is new ...").
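Python offers a rough analogue of both ideas, sketched below. The names Seconds, Meters, and burn_duration are made up for illustration: typing.NewType creates a name that static checkers treat as distinct, while the one-field wrapper class is the Python counterpart of the C struct trick and is genuinely distinct at runtime.

```python
# Two ways to get "same representation, different type" in Python.
from typing import NewType

# Distinct only for static checkers such as mypy; at runtime it is a float.
Seconds = NewType("Seconds", float)

class Meters:
    """One-field wrapper: a genuinely distinct type, also at runtime."""
    def __init__(self, value: float):
        self.value = value

def burn_duration(t: Seconds) -> float:
    return float(t)

print(burn_duration(Seconds(3.0)))
# burn_duration(Meters(3.0))  # rejected: Meters is not Seconds
```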