Skip to main content

OO in Language Design

If you were to create a new programming language, and you wanted to support object oriented programming with it, what would it look like? An object is a set of instance data and various functions that work with it. In Java, they're specified by first grouping the variables and functions together in a class, and then creating an instance from the class definition. In JavaScript, functions are dynamically attached to the instance data (or it's prototype). This is an intentional over-simplification but does highlight the fact that different programming languages approach working with objects differently. I've been thinking about objects and working with structured code in relation to Kiwi and it depresses me how much the object oriented school of thought that started with C++ and came to fruition in Java has influenced our industry.

Of course there are the so-called five concepts of object oriented design, which says a design that meets the following criteria is considered object-oriented (OO):
  • Object/Class
  • Information Hiding
  • Inheritance
  • Interface
  • Polymorphism

But I'm not so sure I'm convinced. A coworker accused me of trying to re-define it:

Coworker: So now we're redefining what OO is too... care to define common words as well?

Me: I'm not redefining it per se... just calling bullshit where I see bullshit.

Coworker: Excuse while I go outside and laugh loudly.

Me: Sorry my name isn't Martin Fowler so my opinion probably doesn't matter much, but I still smell bullshit.

If we think about programming in C for a moment, a language that isn't considered to support OO practices, we can still encapsulate our code and provide an interface for data manipulation. Member variables are placed in a struct, public methods are functions listed in a header file, and private methods exist only in the code file. Encapsulating in this case depends on an understanding of variable and file scoping rules of the language, not the existence of some class, public, protected, and private keywords.

We have to pass a pointer to the instance class into the functions, but that still happens in languages that are considered to be OO languages too; it's just syntactic sugar that automatically passes it and assigns it to this. It's difficult to ignore the similarity between foo.doSomething() and doSomething(foo).

Java has developers define the method functions in the same code unit as the member variables. Others allow defining them separately, and then associate them in some way later. C++ does this through file and namespacing, and Go does this with interfaces.

If we were to think of instances of objects in terms as payload structs, then is inheritance really necessary? We don't extend a class for example because we have a protected integer member we want to reuse, we do it for the functionality provided by method functions. Inheritance supposedly helps with code-organization and promotes re-use, but I'd argue not essential for OO. In fact, the proliferation of OO far exceeds the amount of code we actually end up reusing.

Polymorphism probably isn't really needed either for OO. Data typing is all about the underlying semantic meaning of data and what operations are allowed to be performed on it. What it means to "add" two strings together and what it means to "add" two integers together are different even though both are aggregate operations. Typing is less relevant in dynamic languages like PHP and Ruby. We don't care what type an object is so long as we can call a method on it. The concept of polymorphism is only important in statically-typed languages, and promoting it as a core OO concept means a true dynamically typed language could never be object oriented.

The way I see it, encapsulation begets abstraction. Inheritance is about reusability, not objects. Interface is a design pattern that attempts to overcome difficulty for reusing code from different objects. Pure object oriented programming seems really to be about encapsulation and message passing, while the concepts above seem more concerned about structure.

It can be argued that the software industry is in the mess it's in right now because of the over-definition and application of intolerant ideas about OO, and a refusal to acknowledge other schools of thought on the subject. But if I'm taking on the responsibility to create a new language, then I'm going to take the time to think about such things and not just follow the herd. I want to create something better, not just a better Java.

Comments

  1. I always appreciate your ability to focus on the core maxims rather than get distracted by petty implementation specifics. It allows you to stay open to core principles and approach problems as opportunities for perfect solutions. I think that if there's any "mess" in the industry, which I wouldn't go so far as saying "mess", it's because people are unwilling to try something new, be wrong, be humble about getting corrected, learn something new, and grow. Keep it up!

    ReplyDelete
  2. Polymorphism isn't just about data typing. I use that for functions all of the time. So that means that Polymorphism can work on dynamically typed languages, too. To me polymorphism is the strongest and best part of OO. Oh, and just because I change a function, doesn't mean I'm not doing OO. I've had this argument before. An object, is an object, is an object... just because one small exception is found, doesn't mean I'm forced to create a new object. I can keep the same object with a special exception as long as it doesn't go against my overall design.

    ReplyDelete

Post a Comment

Popular posts from this blog

Geolocation Search

Services that allow users to identify nearby points of interest continue to grow in popularity. I'm sure we're all familiar with social websites that let you search for the profiles of people near a postal code, or mobile applications that use geolocation to identify Thai restaurants within walking distance. It's surprisingly simple to implement such functionality, and in this post I will discuss how to do so.

The first step is to obtain the latitude and longitude coordinates of any locations you want to make searchable. In the restaurant scenario, you'd want the latitude and longitude of each eatery. In the social website scenario, you'd want to obtain a list of postal codes with their centroid latitude and longitude.

In general, postal code-based geolocation is a bad idea; their boundaries rarely form simple polygons, the area they cover vary in size, and are subject to change based on the whims of the postal service. But many times we find ourselves stuck on a c…

Composing Music with PHP

I’m not an expert on probability theory, artificial intelligence, and machine learning. And even my Music 201 class from years ago has been long forgotten. But if you’ll indulge me for the next 10 minutes, I think you’ll find that even just a little knowledge can yield impressive results if creatively woven together. I’d like to share with you how to teach PHP to compose music. Here’s an example: You’re looking at a melody generated by PHP. It’s not the most memorable, but it’s not unpleasant either. And surprisingly, the code to generate such sequences is rather brief. So what’s going on? The script calculates a probability map of melodic intervals and applies a Markov process to generate a new sequence. In friendlier terms, musical data is analyzed by a script to learn which intervals make up pleasing melodies. It then creates a new composition by selecting pitches based on the possibilities it’s observed. . Standing on ShouldersComposition doesn’t happen in a vacuum. Bach was f…

Reading Unicode (UTF-8) in C

In working on scanner code for Kiwi I did a bit of reading up on Unicode. It's not really as difficult as one might think parsing UTF-8 character by character in C. In the end I opted to use ICU so I could take advantage of its character class functions instead of rolling my own, but the by-hand method I thought was still worth sharing. Functions like getc() read in a byte from an input stream. ASCII was the predominant encoding scheme and encoded characters in 7-8 bits, so reading a byte was effectively the same as reading a character. But you can only represent 255 characters using 8 bits, far too little to represent all the characters of the world's languages. The most common Unicode scheme is UTF-8, is a multi-byte encoding scheme capable of representing over 2 million characters using 4 bytes or less. The 128 characters of 7-bit ASCII encoding scheme are encoded the same, the most-significant bit is always 0. Other characters can be encoded as multiple bytes but the mo…