Trying to do good for the wrong reasons

2009-01-10 17:42:05 +0000

Many programmers with at least some education in computer science code by a few principles:

  • Encapsulate database access using Data Access Objects, so we're not tied to one particular database.
  • Use the Model View Controller pattern, so we can change the view logic easily without having to change the entire software.
  • Use Facade beans and Remote Beans, so we can split our application server into a front-end server and a business-logic server.
  • Know the book of the GoF by heart and use their patterns, so our code is good.

However, that's all wrong! The principles are good when applied correctly, but the reasons are all wrong, and that results in mediocre software. If changing the database, for example from MySQL to Oracle, or changing the view layer from JSF to GWT is a real possibility in a project, then that software project is very badly managed. The ability to swap out a product shouldn't be the reason. And if people think they can make their software more robust or perform better by setting up an extra business-logic server, they'll be disappointed.

Because people often use the wrong reasons, a lot of code is overly complicated, badly performing and highly unmaintainable. You know something is wrong if you have to wade through a lot of interfaces, Impl classes and XML configuration files just to know what a piece of code is doing.

Naturally the principles are good; the goal, however, is totally different. The reason for all of the principles should be: write correctly functioning code that can be adapted easily, now or in the future.

Design patterns aren't a bible

Know that no book on design patterns should be used as a bible. Wrongly applied design patterns produce terrible code. The book of the GoF was a great book because its authors (Gamma et al.) introduced the concept of design patterns and gave a list of good design patterns for C++ and Smalltalk. Many of those design patterns are relevant, but many are not.
Design Patterns
Erich Gamma, Richard Helm, Ralph Johnson & John Vlissides

The most important contribution of the book is the notion of design patterns: giving common names, like Data Access Object, Visitor or Singleton, to programming constructs. Because certain programming patterns have a name, people can talk about them and understand each other's code. This is similar to how architects use various patterns to design houses, office buildings, public buildings or monuments. The origin of design patterns was a book about architecture:

The Timeless Way of Building
Christopher Alexander

There are endless design patterns besides the ones listed in Design Patterns: Elements of Reusable Object-Oriented Software. Every programming language, framework, software project and application has its own patterns. Recognize patterns, and give them a name if they haven't got one. Make sure the patterns and their names are known within the company or organization, or among all users of a framework. That way, the code will be easy to grasp for anyone who has to modify or extend the software, because of the common naming and usage.

Great coder

Read who a great hacker is and how to become a better programmer, and read what you can do to be a better coder. The main reason software development is so expensive is that there isn't enough emphasis within many organisations on writing good code.


Parallel execution of SQL

2009-01-02 18:10:45 +0000

The end of increasing clock speeds

Processor clock speeds have largely stopped increasing due to physical limits. The main way left for processor makers like Intel and AMD to increase the speed of a processor is to add extra cores. Making software scale near-linearly with the number of cores (meaning: doubling the number of cores should double the performance of a program) is quite hard with the way most programmers and architects currently program.

Parallel execution
Fortunately there is one language that is very well designed to be executed in parallel: SQL, the language for querying databases. Running a query in parallel means the query is split up into subqueries. Each of those subqueries runs in parallel, and the results are then combined.
For parallelism to have an advantage, either the database server must have multiple cores or CPUs, or the database has to run distributed across multiple nodes (servers/computers). In the latter case, when running on multiple nodes, there has to be a very fast network connecting these nodes. The database-server software also has to support running queries in parallel. Well-known database software such as Oracle can do this out of the box.
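
To make the split-and-combine idea concrete, here is a minimal sketch in Python of how something like SELECT avg(grade) FROM students could be computed in pieces. It is only an illustration and not how Oracle or any other DBMS actually implements parallel execution: the function names partial_avg and parallel_avg, the workers parameter and the sample grades are all made up for this example. Each "subquery" returns a partial sum and count for its slice of the rows, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_avg(grades):
    """Work of one hypothetical 'subquery': (sum, count) for its slice of rows."""
    return sum(grades), len(grades)


def parallel_avg(grades, workers=4):
    """Split the rows into chunks, aggregate each chunk concurrently, combine."""
    if not grades:
        return None  # mirrors SQL, where avg() over zero rows is NULL
    chunk = -(-len(grades) // workers)  # ceiling division
    chunks = [grades[i:i + chunk] for i in range(0, len(grades), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_avg, chunks))
    # Combine step: averages cannot be averaged directly,
    # so the sums and counts are added up first.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Made-up sample data standing in for the grade column of a students table.
print(parallel_avg([6, 7, 8, 9, 5, 10, 7, 8]))
```

Note that the workers return a sum and a count rather than an average, because the average of partial averages is wrong when the chunks differ in size. Threads in CPython will not give a real speed-up for this kind of work, so the sketch shows the structure of the computation, not the performance gain.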

When you have an application that is used by a lot of users doing much of the same thing at the same time (more users than the number of nodes and cores per server), running queries in parallel doesn't have any advantage. However, when a small number of users (or just one) are running heavy queries (fewer users than the number of nodes and cores per server), running queries in parallel can have great advantages.

A SQL statement like SELECT * FROM customer can be run in parallel, and this applies especially to aggregation functions, as in SELECT avg(grade) FROM students or SELECT max(count(*)) FROM student GROUP BY class. Inserts can run in parallel too, although that only gives a slight advantage, and only when inserting a lot of data at the same time.

A modern DBMS (Database Management System, usually called database software or simply a database), like the well-known DBMS of Oracle, can run queries in parallel on multi-core systems out of the box. Running queries on multiple nodes is possible with software such as Oracle RAC. For a detailed explanation of parallel execution in Oracle, see the whitepaper Oracle SQL Parallel Execution, or a posting by Don Burleson for a short explanation.

In short, a modern DBMS can do a lot more than you may imagine. Smart SQL can solve quite a few problems in concurrent programming and in optimizing software for multi-core or multi-node environments.
