Month: December 2005

Java Schools

Mail sent to Joel in response to this:

Ha!

I’ve been through this myself. Back in the mid-80’s I got a degree in Applied Statistics and Computing. There was an awful lot of numerical maths and heavy statistics as well as a firm grounding in structured programming. We were taught using Pascal, which did have pointers but nowhere near as bare bones as C. The year I left they took out a lot of the maths because people were whining – what did they think the course was going to be about with a title like that? Knitting? I wish I had done a proper compsci course; all of my knowledge has come from reading books on compiler design and how to write better C, and I bet none of these books are in print any more. Same with database design – no-one in the current crop can design a database properly or write SQL that gets data from more than one table without major brain surgery. The Java crew just use raw tables without proper relationships and then look on in surprise when their databases are full of junk orphaned data because of an application crash in the middle of a transaction (no, they don’t understand transactions either).

This stuff has been driving me mad for ages. One of the things you didn’t mention (maybe indirectly because of the sanitising of J2EE) is that Java generates an awful lot of frameworks. It drives me nuts trying to get anything done – each has another acronym and XML file, each is independent of the others, and each is useless, and each is embedded in every IDE you can use with no proper explanation of what the hell it adds to the mix.

I recently switched to PHP to get a personal project done from Java Struts. I got it done in about 2 weeks in the evenings and this included learning PHP and translating an XML config reader from Java.

Java is, in fact, too low level in many ways. Everything generic is type Object and then you have to screw around with it to work out what it really is. I’m currently studying for my EJB certification and it’s totally perverse. Badly-named packages with interfaces in different places, a query language for using entity objects together to construct queries (nasty and pointless), the list goes on.

I do think that in corporate terms Java is the new C, mainly because it’s become the language of assembling apps from other people’s components, and it’s also “safe”. I don’t like it any more, I did when it was new, but now it’s too clever (in the wrong sense) for its own good.

I agree totally about the int/Integer thing, as well as why the hell is there a char and a String? Why are there base types and then wrapper objects – why aren’t there just objects? Those objects can be implemented very efficiently with some syntactic sugar to make it easy but instead we have this deaf-mute child of C++ forced on us. The worst bits of C++ combined with the worst bits of Smalltalk. Why doesn’t the syntactic sugar around strings work for StringBuffer, which is the mutable version of String? Lots of tiny objects in a large web app will start to fill the heap with loads of tiny thrown-away bits of strings. Using a mutable object could ameliorate this, but you’d have to have the insight to realise it and then rewrite all of the string manipulation using heavy-looking method calls, so no-one does.

My blog is on http://francis.blog-city.com – don’t worry, I attack C as well but for different reasons.

An interesting take on object orientation etc.

Have a look at this:

I sort of agree with him. See the comments on Java and whatnot I’ve been making here recently.

But I don’t see why we need to care about inefficiencies in structures in ‘C’. If you care about this then use the alignment technique I gave earlier. I like structured data, probably because I’m a database programmer by training and inclination. I can’t stand writing code like this:

select a, b, c, d
into var_a, var_b, var_c, var_d
from bingo where …

gimme

select a, b, c, d
into bingo_rec
from bingo
where ….

This looks trivial. But you try it when there are 20 or 30 columns you are selecting and you (in a fit of editing) inadvertently transpose a couple of lines. Like I say, you have to be disciplined, but why be too disciplined?

Another thing is I like syntactic sugar around things, I like using structures to pass structured data around. I know (I was trained to program in the 1980’s) that all of the data structures you’ve ever wanted can be modelled using arrays, but I don’t want to. I don’t want to have to think about silly bugs I may have put there when I reinvented my linked-list or stack as a home-grown array for the fiftieth time. A well-crafted utility library (with thousands of users) will be more reliable (eventually) than anything I write myself.

This approach to coding also doesn’t work well with teams and different levels of ability. To write code in the ‘C’ and no datatypes style requires a good memory and lots of discipline. This isn’t a bad thing, just very challenging to find enough people with that mind set (and talent, to be honest). Java is very much a big-corp, lots of specifications, language. It lends itself to that approach because of the built-in support for interfaces and polymorphism that object-orientation gives you.

It’s just that I’d like the whole purple spotted o-o bus to be optional (or gradual) when I need it.

Quick ‘C’ programming tip for aligning those pesky bytes in structures

Just a quick note. One of my friends, Roger, was on the ANSI ‘C’ committee.

One of the problems, if you have ever written an interface that either listens on a pipe for structured data or tries to read a packed binary file written by COBOL or somesuch, is something like this:

struct record
{
    char indicator[3];
    int bill_amount;
    /* etc. not sure of syntax these days */
};

If you try to read directly into this you will find that, even though the indicator variable is 3 bytes, the compiler will typically have padded it out to four (or more), so that the integer falls easily on a word boundary. This doesn’t matter if you are writing a program that doesn’t need to talk to anything else. But if you have a packed record where the fourth byte is indeed the beginning of the integer variable, you are in trouble when you try to read it in.

The traditional solution is to harass your compiler vendor until they tell you what the undocumented switches are that stop it padding to word boundaries, so that a byte in memory is exactly a byte from your pipe or file or whatever.

In the ANSI standard, however, tucked away, you can do this:

/* assuming 4 byte int */
union packed_record
{
    struct record
    {
        char indicator[3];
        int bill_amount;
        /* etc. not sure of syntax these days */
    } rec;
    struct filler { char dummy[7]; } fill;
};

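The idea is that the compiler/linker is forced to byte align, because otherwise the filler structure would not line up with the record. As I say, I’m not sure of the syntax these days, and it is worth checking sizeof on your own compiler before trusting it, because the standard leaves the padding inside a structure up to the implementation. Still, there is the idea in a nutshell. Beware that not aligning things can give you horrible errors on certain processors though.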
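If you would rather not depend on what the compiler does with the structure at all, one portable alternative is to read the packed bytes into a plain buffer and copy each field out by hand. This is a different technique from the union idea above, not a version of it. It is only a sketch: it assumes a 7 byte record (3 bytes of indicator, then a 4 byte integer) written in the same byte order as the machine reading it, and the names decode_record, amount_offset and PACKED_RECORD_SIZE are made up for the example.

```c
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* int32_t */
#include <string.h>   /* memcpy */

/* In-memory layout: the compiler is free to add padding in here. */
struct record {
    char    indicator[3];
    int32_t bill_amount;
};

/* Assumed layout in the pipe or file: 3 bytes, then 4 bytes, no gaps. */
#define PACKED_RECORD_SIZE 7

/* Where the compiler actually put bill_amount (3 if packed, often 4). */
size_t amount_offset(void)
{
    return offsetof(struct record, bill_amount);
}

/* Copy the fields out of a packed buffer one at a time, so the result
   does not depend on how the structure is padded.
   Assumes the writer used the same byte order as this machine. */
void decode_record(const unsigned char *buf, struct record *out)
{
    memcpy(out->indicator, buf, sizeof out->indicator);
    memcpy(&out->bill_amount, buf + 3, sizeof out->bill_amount);
}
```

On the sort of machine most people will try this on I would expect amount_offset() to come back as 4 and sizeof(struct record) as 8, which is exactly the mismatch with the 7 byte record, but neither number is guaranteed.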

Hope this saves some pain out there.

Java is the new ‘C’ part 2

We had adjourned to the kitchen, it was done in some old mock style from somewhen. Not Victorian, but nearly, pretty ugly. I whisked the green tea and followed the ancient ritual. The master had never subscribed to the nonsense about coders being caffeine-fuelled chocolate junkies. Brain-dead, tired, and fat, not to mention incipiently diabetic. The master could be hard at times, but fair. The hot tea steamed up my glasses, the spittle-encrusted coat was being dried on a chair next to the Aga. We sat quietly for a few moments’ contemplation. I noticed that his collection of pipes still adorned the mantelpiece. I could see his need for the demon weed pulling at him, but he firmly turned his back on it.

So, master, you say that we at least have two types now, ints and strings? Why is this bad?

It isn’t bad, my young friend, just silly. Who needs typed variables any more? A variable should be able to be whatever it pleases, and have a bath twice on Sundays if you wish it. To pass things around in Java you have to create utilities that use Object types and then remember what type they are, or you have to hide some array-like construct (say a Vector) in an accessor class that returns the correct type – Java 5 has some syntactic sugar for this, to be sure, but it always generates a compiler warning message, which really drives me crazy. Java 5 isn’t mainstream yet, all of your legacy code is still riddled with anonymous objects flying around – it’s a mess. It’s said to be flexible, but it’s just a lazy anarchy inherited from C++.

He sighed, and you have to wrap the base types in wrapper classes – again Java 5 has removed some of this by doing it automatically. But why has it taken ten years to take this implementation pain away? Why has it taken ten years to do some of this? Why isn’t there syntax for associative arrays? Because the vision of the language designers was clouded by ‘C’ and C++. C# is just a continuation of this as well, as far as I can make out. I don’t have time to learn the Dark Side as well as the Grey.

And that brings us to the vexed question of the absence of lambda functions and variables that point to functions – a piece of brilliance that goes all the way back to Lisp, even ‘C’ could do pointers to functions, I used it all the time, but of course it scared people who wanted to use ‘C’ to write COBOL.
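For anyone who has not seen one, here is a rough sketch of what the master means by a variable that points to a function in ‘C’. The names (full_price, half_price, price_rule, choose_rule) are invented for the example and do not come from any real library.

```c
/* Two hypothetical pricing rules with the same signature. */
int full_price(int amount) { return amount; }
int half_price(int amount) { return amount / 2; }

/* A type for "pointer to a function taking an int and returning an int". */
typedef int (*price_rule)(int);

/* Pick the function at run time with a plain if statement. */
price_rule choose_rule(int is_sale)
{
    if (is_sale)
        return half_price;
    return full_price;
}
```

Calling it looks like choose_rule(1)(100), which hands back half_price and then applies it to 100. No classes or interfaces are involved, which is rather the point.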

I was puzzled. I thought you didn’t like ‘C’, master?

Did I say I didn’t? I just said you had to be very disciplined and test things thoroughly. This palace you sit in was built by ‘C’ and Unix, all I said was you shouldn’t take the crazed rantings about simple and confuse it with complete.

Master, surely anonymous classes are the lambda functions you seek?

He paused for a moment and cracked some walnuts with his new biceps, while he considered this. Then he smiled and offered me some. I declined for my biceps have not been subject to his punishing training regime.

Grasshopper, you are beginning to learn. That is the first original thought I’ve heard from you; keep it up. So we could define some arbitrary interface that allows us to define an arbitrary, anonymous object that allows us to create our anonymous function depending upon, say, an if statement?

I nodded. He sipped his tea and reached meaningfully for the sugar bowl.

Ah, master, again, no sugar. You can do it but it involves creating objects and interfaces that type those objects.

He smiled again, not touching the bowl after all. Unnecessary complexity, functions can’t just stand out there in the naked wilderness and do their jobs, you are forced to go class class with extra fries and a milk shake to write 5 times more lines of code than you would in Python, PHP, or Lisp. Like everything in Java, you can do it but it hurts, it takes time and it’s a pig to maintain because the way you are using the construct is too outré for the masses that may have to look after your code when you have finished.

… you just try getting a comment saying this is a lambda function construct through a code review. He started giggling to himself and started on another chorus of ints and strings.

Let us pause here, for a little while, Grasshopper, and we will talk about Java and the Web. Why is it so slow to develop in, and what can be done about it?

I sat and watched the steam rise from the tea cups, content to be silent for a while. I wondered, how do you use ‘C’ to write COBOL?

Java is the new ‘C’ part 1

When I next saw the master he was on a new health kick, gone was the pipe and the toffee tobacco and in was the low fat food and exercise. He still managed to fix me with a rheumy eye though.

Grasshopper, it’s been a long time, many heads have passed under the surgeon’s knife since we last talked. What have you been doing, my young friend?

I swallowed hard and tried to look him in the eye.

I’ve been using Java, master.

Java? You’ve been using Java, eh? Have you learned nothing? He was not far from incandescent with rage; I had expected this.

I interjected – I need to earn a living, master, and Java pays the bills.

So? Grasshopper, my young friend, don’t you realise Java is the new ‘C’?

Master, master, it has no pointers. How can it be the new ‘C’?

Well, now, well now – remember how everything was a 16-bit integer?

Yes, master, then it became 32 and then 64 …

But still, an integer?

Yes, master.

In Java – everything is a String, immutable, heavy, can’t be extended. A string.

I tried to interrupt. Master – surely everything is an object?

Only sometimes, even then … what is XML? Long strings, what is a properties file – a bunch of strings – what do you spend all day doing? Throwing strings around – In your JSP’s and anything else – Strings. He shuddered. I have nightmares about them.

Master, what is wrong with strings?

They are not a type, but a class. They should have been a type, or everything should have been a class. When you write code you’d like to just use strings and not have to worry about the rules, the way you can use an integer or a float. It’s that, if I was designing a language from the ground up, I wouldn’t mess with such petty distinctions. AND, Grasshopper, AND strings are final, which means you can’t add to them, or you have to wrap them in another class of your own. They were afraid of security issues where people override certain methods and put in malicious code. They could have just made the standard methods final, but that would have been too easy.

He paused and drew breath, I wiped the spittle off my coat.

… and they’re immutable – so they consume loads of memory when you manipulate them and people don’t realise. The documentation says use the StringBuffer class, because it is mutable, but there are hardly any decent examples showing how to do this, and if you want to call a standard library function (or SAX or whatever you like) with StringBuffer – you need to call toString – which makes it all a bit pointless. And that lovely syntactic sugar using + to manipulate strings doesn’t work with StringBuffer – gah! Rubbish! Time-wasting rubbish.

Not to mention – he was flying now, eyes almost popping out of his head – why does Java have a char type and a string type? What nonsense is that? Why does it have any of the other types – the bytes, shorts and all that rubbish inherited from ‘C’? Why would you ever want a short integer? Running out of memory? On a modern machine with modern amounts of memory? They force you to learn all that irrelevant crap to get certified – I’ve only ever had use for ints and strings – maybe Java is slightly better, maybe it has two things to play with rather than one.

He started musing to himself. Singing ints and strings to the tune of Police and Thieves, although I’ve never seen either scare a nation with their guns and ammunition.

Java is also very low-level, that ancient crusty ‘C’ way of doing things, no sugar that can be efficiently compiled. You have to keep typing the same array walking constructs in again and again. There isn’t any syntactic sugar for things like lists and dictionaries and walking arrays of objects. Oh, yes, there are classes that support them, but no syntax. In a usable language like Python I can say something like

[mapping-expression for element in source-list if filter-expression]

This will give me a portion of a list depending upon the if statement. PHP has similar ideas (although not as condensed). Python will let me split lists, look at them from the end etc. etc. and it also has dictionaries and syntax for them as a base type. Java gives me 200 different implementations of the dictionary interface – why should I have to care? Blinkin’ computer scientists, too clever by half.

The fire was really glowing in his eyes now. I was beginning to worry about ever getting my coat dry again.

And I still hate {} to delimit blocks of code; if should have end if, loop should have end loop, there’s a whole class of stupid errors that can’t happen if you do this. Again, Python uses indentation to do this – so it’s clear what goes where.

To me this is a flaw with PHP, in that it has too much ‘C’ heritage but its way of doing class- and instance- variables is vastly superior to Java:

self::$var;   // This is a class variable
$this->var;   // This is an instance variable

To use a global in a function you have to declare it

global $var ;

In the method that you want to use it in. The syntax is a little clunky but at least you can tell these kinds of variable apart, a whole class of bugs never appears. And the object stuff in PHP and Python is optional, you don’t have to worry about it if you just want to write a quick script to get something done.

I sighed and said, would you like me to make some tea, master? It was going to be a long day.

More from the master

Cohesive Libraries

http://www.regdeveloper.co.uk/2005/12/11/cohesive_code_packages/

Kevin’s responses are posted here in black

I know what you mean about the java.util package.

On a slightly related note I’m studying for the Business Components Developer exam and you need to learn several interface specs.

Not a problem.

Then you need to know which comes from java.rmi.*, and which from javax.rmi.*.

Ah, an unnecessary subtlety.

Why have the same thing twice – why is one object extended and the other not? Why?

The Java libraries are riddled with this and it makes passing the certifications unnecessarily hard. I want to slap whoever did this but have to control myself.

Yes, it appears to be a fact you have to know for the fact’s sake. In other words, a poor decision on cohesion has been elevated to “clever” status.

At the feet of the master

This is a reprint of something I wrote ages ago. Thought it might amuse. I’ve cleaned up the swearwords – you can guess them back in if you care.


Ah yes, indeed. The old programmer sucked contemplatively on his pipe, waving the bowl dangerously close to my ear. In my day you had to watch for them, the little dears, they’d have your hand off as soon as look at you.

He was talking about the elusive wild pointer in its many guises.

In the beginning there was C, a stupid language that had its place in the pantheon but it was basically jumped up assembler aimed at some abstract machine. The promise … he coughed alarmingly for a while, waving fragrant smoke across the room. Now there was no cancer a lot of people had started to smoke again – it was a sign of affluence and power instead of stupidity. He watched the colours of the spots dancing in front of his eyes, with a rheumy intensity.

The promise that the code would run anywhere if you recompiled it because it was all aimed at the same simple minded abstract machine.

I interjected – this was true wasn’t it?

He stopped and regarded me over his meerschaum, the smoke falling down over the rim – only like it is true that all books are made of paper. People would do things like attempt to go back to the beginning of a file that was a terminal session, they would write code that couldn’t be bothered to check for errors and the language didn’t have any error handling except for some junk added involving global variables that you had to keep testing and no-one could be bothered. No exceptions, and, of course the curse of programming, the pointer.
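The junk involving global variables is errno. A minimal sketch of the ritual, using strtol from the standard library. The wrapper name parse_long is my own invention for illustration, and it simply turns the global into a return value so the caller cannot forget to look.

```c
#include <errno.h>    /* errno, ERANGE */
#include <stdlib.h>   /* strtol */

/* Parse a decimal number.
   Returns 0 on success, -1 on overflow or if no digits were found. */
int parse_long(const char *text, long *out)
{
    char *end;
    long value;

    errno = 0;                    /* has to be cleared by hand first */
    value = strtol(text, &end, 10);

    if (errno == ERANGE)          /* the global that nobody tests */
        return -1;
    if (end == text)              /* nothing was converted at all */
        return -1;

    *out = value;
    return 0;
}
```

Leave out either of the two checks and the program still compiles and still appears to work, which is the complaint.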

You could recompile something and it would work. But once the number of lines of code (not including comments if you were lucky enough to have some) got over about two hundred it would start to fall to bits. Then you had to be disciplined. The language was so awful that people invented naming conventions for variables to check for type collisions – imagine – programmers checking because the compiler could not! People doing mental drudge work – I mean, what the forage are computers for? Everything is a sixteen bit integer except when it really matters and then your program won’t work and you can’t find out why. Don’t forget, either, that before this abortion there had already been fifteen years of research, LISP machines existed with fully integrated, extensible programming environments – all the things you take for granted in your “modern” sexy IDE were already there in essence, years and years ago.

So we went back to the stone age and lost those twenty years. C was responsible for Unix; like Unix it was obsessed with simplicity and being clever at the same time. So we had line-mode text editors with substitution commands that were incredibly powerful. We had a replaceable command shell that did weird things like expand all of the command line arguments for you so you could never check things like did you really mean to delete all files in this directory? The call from the C and Unix crowd was all about being simple.

My eyes started to glaze over, I had heard the Master on this topic before, simplicity is all right but things must be complete also or you are wasting your time.

He flicked me on the nose, and the pipe got rather too near for comfort. Listen, and I will initiate you into another great mystery: simple is not correct but it can give the veneer of correct. Simple is very very dangerous in the wrong hands.

His pipe had gone out, thank COBOL, but he still had the wit to relight the thing.

Simple is where the wild pointer comes from. He sucked industriously and the flame went up and down with his eyebrows. It’s easier to implement a language if you don’t have to create things like strings as a proper type, instead have a pointer to a place in memory that contains the strings and write a couple of libraries if you feel generous that will allow you to do things like compare them and copy them (copying came later after everyone had written their own bug filled routine). In the same vein an array and a pointer are the same thing, a string is an array of characters, and everything is a sixteen bit integer anyway. Just for laughs make the library routines have completely unmemorable names and slightly different, but very simple, conventions. Multidimensional arrays are arrays of pointers; so you’ve got pointers to pointers ad infinitum. Then add in the fun-filled idea that a comma is an operator so an inexperienced Pascal programmer will not use a[0][1] but a[0,1], which is valid but not what was intended – who knows what it means? I can’t remember. Oh yes, arrays start from 0 too. All this is post Pascal, which was a good attempt to make life easier for the programmer and simply didn’t have all of these weaknesses, but it was meant for teaching and didn’t have the low-level stuff in a standard form.
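For the record, the meaning of a[0,1] is well defined, just unhelpful: the comma operator evaluates the 0, throws it away and yields the 1, so a[0,1] is the same thing as a[1]. A small sketch, with a made-up array called grid and made-up function names:

```c
/* A 2 x 3 table, as a Pascal programmer might think of it. */
int grid[2][3] = { { 10, 11, 12 }, { 20, 21, 22 } };

/* What was intended: row 0, column 1. */
int intended(void)
{
    return grid[0][1];
}

/* What grid[0,1] means: the 0 is discarded, so this is grid[1],
   which is a whole row. Taking its first element gives 20. */
int surprise(void)
{
    return (grid[0, 1])[0];
}
```

A decent compiler will usually warn that the left-hand side of the comma has no effect, but it is only a warning and the code is perfectly legal.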

But the consequences of such glorious simplicity, grasshopper, the consequences are bad for all momma’s li’l chil’n. Unless you are a black belt computer science type you won’t understand half of this and not have the discipline necessary to make it go.

So what happens, master? The pipe was out and I was saying nothing – he will insist on smoking toffee tobacco, which smells of burnt sugar.

People write simple programs that work. But they aren’t correct, they don’t check for errors because the language doesn’t help you do this easily and it isn’t part of the culture. Some compilers will initialise variables and some won’t. Code will go across machines from different vendors without any problem. But it is still incorrect.

The wet end of the pipe came out of his mouth.

Then you fill up a file system and the machine crashes badly and destroys a load of work, and no-one knows why. Then you have a pointer you allocated memory to and deallocated, without setting it back to the empty value, that messes all over something you wanted to keep, but only sometimes, because you needed to reinitialise it but had no way of finding this out. Then you have a situation where your program fails, even when it seems OK, and it can be fixed by you changing the order you declare your variables. But is it then correct? Do you trust it?
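The usual defence against the stale pointer is to reset it at the moment you free it. A minimal sketch, assuming a hypothetical helper called release that only deals with pointers to int:

```c
#include <stdlib.h>   /* free, NULL */

/* Free the block and set the caller's pointer back to the empty value,
   so later code can test for NULL instead of writing through a pointer
   to memory that may now belong to something else. */
void release(int **slot)
{
    free(*slot);      /* free(NULL) is allowed, so calling twice is safe */
    *slot = NULL;
}
```

You call it as release(&p) rather than free(p). It does nothing for any other copies of the pointer that are lying around, so the discipline is still down to you.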

Underneath everything is an integer that is cast into different roles depending upon what it is supposed to do. This means that everything is nothing, everyone has to think like the foraging compiler. How many people are that good? Not many, and even the good ones need time to make their mistakes and learn. Even the good ones can be wrong sometimes. Every crappy little four line routine manipulating strings needs to be thoroughly tested and you know in your heart that someone already wrote the same routine a thousand times.

Is it any wonder software is so expensive to write?

More from the master