Friday 14 November 2008

How to UTF-8 in JSP/Webapps

Although I never planned to post any code on my blog, there is one annoyance that I think deserves to be documented.
I am talking about UTF-8/Unicode in Java webapps and JSPs. In my opinion this should have been solved by default 10 years ago, when Java was still new, so that we wouldn’t have to deal with this unnecessary problem.

Why Unnecessary

Because hardly any component involved in development uses UTF-8 or another Unicode encoding by default, so each one has to be set up by hand! I lost a couple of days because of this.

How to Fix it?

During the development of DITO these components gave me a headache:
  • Browser/HTML
  • JSP Encoding/Post Request
  • Database
  • JDBC Connection
  • Tomcat / Java File Encoding
  • Files and input streams
  • Console
Each of these can be the cause if your Cyrillic, Greek or other special characters turn into “?”, squares or other rubbish. I will go through them one by one:

Browser

Although I personally didn’t have problems with this in Opera, Firefox and IE, it is still recommended to put these lines at the beginning of your JSPs:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
I suppose most good tools such as NetBeans, Dreamweaver and Eclipse will add these lines at the beginning of your code.

JSP Encoding/Post Request

Two lines that will already fix most people’s problems are these:
response.setCharacterEncoding("utf-8");
request.setCharacterEncoding("utf-8");   
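They force Java to send and receive data using the UTF-8 encoding. Note that request.setCharacterEncoding only has an effect if it is called before the first parameter is read from the request. The receiving part is most handy if you receive text through a POST request. I recommend using POST because the user doesn’t see the data in the URL, and UTF-8 encoded URLs can cause trouble, with old proxy servers for instance.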
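To see why the receiving side needs the right charset, here is a minimal sketch in plain Java (no servlet container involved). It only mimics what happens when the UTF-8 bytes of a form field are decoded with ISO-8859-1, which is what a container typically falls back to when no request encoding is set. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative only.
final class EncodingMismatchDemo {

    // Encodes the text as UTF-8 (as a browser does for a UTF-8 form)
    // and decodes the resulting bytes again with the given charset.
    static String roundTrip(String text, Charset decodeWith) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        return new String(wire, decodeWith);
    }

    public static void main(String[] args) {
        // A Cyrillic word, written with escapes so that the encoding
        // of this source file does not matter.
        String original = "\u041f\u0440\u0438\u0432\u0435\u0442";

        String wrong = roundTrip(original, StandardCharsets.ISO_8859_1);
        String right = roundTrip(original, StandardCharsets.UTF_8);

        System.out.println("decoded as ISO-8859-1 matches: " + original.equals(wrong));
        System.out.println("decoded as UTF-8 matches:      " + original.equals(right));
    }
}
```

Decoding with the wrong charset does not throw an error, it silently produces different characters, which is why the problem usually only shows up on the rendered page.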

Database

I first discovered this problem when I was using Oracle Express. It was my worst nightmare, as Oracle uses the Latin charset by default and requires you to recreate the database when changing to UTF-8. That is too late if you have already stored some data. Unfortunately I haven’t found a command to change the charset afterwards.
Luckily most of you will use MySQL, which is much more flexible. Mainly you have to set the specific character columns to UTF-8. The easiest way to do so is with the MySQL GUI Tools, which can be downloaded from the MySQL website.
screenshot

There you should set the default character set to UTF-8 when creating a table and verify this setting for every char/varchar and text column.

JDBC Connection

Even if both the streams and the database itself are set to UTF-8, the JDBC connection has to be forced to UTF-8 as well. This is done by adding a short suffix to your JDBC URL.
jdbc:mysql://localhost:3306/mydb?useUnicode=true&characterEncoding=UTF-8
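As a small sketch of how such a URL might be assembled in code: the helper below simply appends the two parameters from the URL above. The class name, method names, host, port and database name are placeholders, not part of any library. One assumption worth stating: if the URL is stored in an XML configuration file, the ampersand has to be escaped there, which the second method illustrates.

```java
// Hypothetical helper; host, port and database name are placeholders.
final class JdbcUrlSketch {

    // Appends the two encoding parameters shown in the URL above.
    static String utf8Url(String host, int port, String database) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Inside an XML file a literal '&' must be written as "&amp;".
    static String forXmlAttribute(String url) {
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        String url = utf8Url("localhost", 3306, "mydb");
        System.out.println(url);
        System.out.println(forXmlAttribute(url));
        // The plain (unescaped) URL is what would be passed to
        // java.sql.DriverManager.getConnection(url, user, password).
    }
}
```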

Tomcat / Java File Encoding

Although embedding text in the page is one of the purposes of JSP, you have to be very careful with hardcoded text. Most IDEs, such as Eclipse and NetBeans, require you to set either your project or the files themselves to UTF-8, otherwise the characters get lost. Luckily both of them will pop up an error message when you try to save a file that contains unsupported characters.
This works fine for standard Java files.
Unfortunately this does not automatically apply to JSPs! The problem is that your web server first converts the JSPs into Java files. If you are using Tomcat this can give you a big headache, as Tomcat converts your nice UTF-8 files into standard ISO .java files and you lose the characters. Even worse: I have not yet found a solution for this issue. For now I recommend moving the strings out into a standard text file.

Files and input streams

When moving your text out into text files, you will find new UTF-8 related problems. One could be your editor, as WordPad and Notepad tend to mess up your encoding as soon as you save your file. In this case I can highly recommend Notepad++.
notepad++
This tool is free, fully supports most character sets and offers a vast number of other features. But as usual this is not enough. When reading these files you also have to set the encoding on your Java readers. This code shows how to open a file using UTF-8 encoding.
// Needs java.io.BufferedReader, java.io.FileInputStream and java.io.InputStreamReader.
// Pass the charset explicitly; otherwise the platform default encoding is used.
InputStreamReader frIn = new InputStreamReader(
        new FileInputStream(fileName), "UTF-8");
BufferedReader brIn = new BufferedReader(frIn);

Console

A confusing thing when debugging can be your console. The problem is that the Windows CMD does not support Unicode. Furthermore, the console within NetBeans also converts special characters into “?”. So instead of using System.out you will have to find another solution. Probably the simplest is to print your output into the HTML document or to write it to a file.
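The "write it to a file" option could look roughly like the sketch below. It wraps an output stream in a PrintStream with an explicit UTF-8 encoding, so the debug text does not depend on what the console can display. The class name, the helper method and the file name debug.log are assumptions for illustration only.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper; class, method and file names are illustrative.
final class Utf8DebugOutput {

    // Wraps any output stream in an auto-flushing PrintStream that
    // encodes everything it prints as UTF-8.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        PrintStream log = utf8Stream(new FileOutputStream("debug.log"));
        try {
            // A Cyrillic test word, written with escapes so the encoding
            // of this source file does not matter.
            log.println("\u041f\u0440\u0438\u0432\u0435\u0442");
        } finally {
            log.close();
        }
    }
}
```

Opening debug.log in an editor that is set to UTF-8 (such as Notepad++) should then show the text as intended.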

Conclusion

I hope that this will help some of you, and that maybe some developers will read this and make our lives easier by finally moving over to Unicode.

Sunday 9 November 2008

Development Continues

I'm finished with my Degree, but that does not mean that the project is dead.

History - What is DITO

This project initially started as my Final Year Project at DIT, where I graduated in June 2008 with a Bachelor of Science in Computer Science.

Development stopped with a prototype that demonstrates the main concepts. Since then I have basically shut the project down, first for a nice summer break and then for lack of time after I started to work.
However, I think this project deserves to survive, so I forced myself to continue working on it.
Screenshot of Prototype
screenshot

Work in Progress

Since the last prototype I have already done a bit of work. This mostly concerned the database, as I moved over from Oracle to MySQL. The reason for that was mostly financial.
Furthermore I slightly changed the database design, as MySQL supports a number of features that I wanted to take advantage of. In contrast to my time at DIT, the main target was/is/will be the website. So I will concentrate on that for the next while and probably continue working on the messenger after the release of DITO. But it is still planned to be released.

What's next?

Next I have to add a number of features to the website to make it competitive with all those other social networks. After that I will have to work on compatibility and some nice JavaScript.
The last step before a first release will be extensive testing and translation work.

Saturday 15 March 2008

Finishing with Coding soon

I know that my last post was quite a while ago (considering the circumstances), which does not mean I wasn’t doing a lot. I finished the website with most of the features I wanted to implement and did some testing. For the final tests I set up a server at home which was accessible through dynDNS and let some of my friends register, have a look at the website and check what mistakes they could find. At the end only a few minor bugs were left, including a few elements which are not clickable, a small bug with string translation and a few strings which have not been localized. Though I know how to fix them, I don’t have the time to play around with them right now. I will have another look before the presentation to fix them.

Back to the Messenger

I will still work on the messenger over the bank holiday weekend and try to get as much done as possible. Messaging finally works as of today, though it is still a bit buggy. I will do my best to implement a few more features such as sound output, use of hash values for password validation and file transfer. On Monday, after I’m back from the St. Patrick’s Day parade, I will finish my work on the messenger, as this date is my personal deadline for coding. From Tuesday on I will work extensively on the dissertation.

Tuesday 4 March 2008

Bug Fixes and Home-Page

I was mostly working on the Social Network. I have now definitely dropped the networks/groups feature, as it is way too much work for a single feature.

I have now started working on the Home page and designed the Blog page. The rest of my work was dedicated to bug fixes and finishing unimplemented code like cookie handling or adding support for international YouTube URLs (didn't know that they actually use different domains for that).

In addition I installed Tomcat + Oracle on my old PC in Germany, which I can access from the Internet through dynamic DNS and port forwarding. The main reason for this is to run public tests in later stages and to check how easy/difficult it's gonna be to deploy the whole system. So far it has worked fine except for one point.
Oracle is (in my opinion) an extreme resource eater, even if the database is tiny. This, combined with the low specs and a low upload speed at home, slows down everything.
The Specs:
CPU: 450 MHz Pentium III
RAM: 256 MB (main reason for the slowdown)
HDD: 2x 12 GB (very slow ones on top of that)
NET: ADSL 6 Mbit/s down, 512 kbit/s up (low upload speed, but ping times are alright)

The main purpose of this is to run user acceptance tests and to get responses/suggestions from others. From a technical point of view it would also allow me to check connectivity through firewalls/proxies later, though I doubt I will get to that: for security reasons I only mapped ports above 1024 at home, and I am no longer planning to implement HTTP communication within the messaging app until April.

Next Steps

  • Finish Home Page
  • Finish Profile Page (most stuff around it works)
  • Make Buddies Page
  • Make Block User/Release Block page
  • Make Messaging Page - At least show the inbox and refer to messenger
  • Add String Tags in File Handling for Localization
  • Send out Localization File to translators
  • Testing

Time Organization

I have set myself a deadline of two more weeks (from last Sunday) for development. Basically I want to finish working on the Website next Sunday and work on the messenger for one more week. After that I will only work on the dissertation till the beginning of April when it has to be submitted.

Friday 29 February 2008

Working on Profile Page

The homepage really is a tough piece of work. Yesterday I finally finished the editProfile page and the registration.

I have now started with the Profile page itself. I had the design ready yesterday by 4AM. But the logic part is really tough. I have done the check whether the viewer is the logged-in user and some more stuff. But it's hardly tested at all since I don't have the buddy stuff done.

Next steps:
  • Youtube Video Selection (Should be done within a few minutes)
  • Read out the BuddyList and display it on the page
  • Work with the wall (Post Comments in the Profile)
  • Show Entries from the Blog (which currently doesn't exist)
Another tough part will probably be the avatar upload, but I'm gonna see tomorrow. I'm just afraid that I will run out of time :-( I have to become faster, especially since the messenger isn't finished. One nice feature I might still implement in the next days is the selection of profile layouts, since I managed to build a decent base for it.
Anyway, it's 3am again. Time to put in some last lines of code and go to sleep.

Monday 25 February 2008

Registration Initially works

It was much much more work than I expected, but initially the registration page works!!!
I had to do some more integration work on the database and the localization class. Though it doesn't sound like it, it was a lot of work. Tomorrow I want to continue with the edit profile page to get the basics done and then continue with the profile itself.

After that I should move on to the avatar upload stuff, which could end up being a huge amount of work. But I'm positive about it. At least I'm now even more satisfied with the huge amount of work I initially put into developing helper classes and stuff like that. I'm even able to reuse classes which were originally only planned for the server, with a few changes in some of them, since I have no control over the order in which instances are created.

Well, it's 4:30am, so I'm closing down the stuff since I actually don't want to skip classes tomorrow.

Sunday 24 February 2008

Productive day

Another important step done for today (though the next day has already started, since it's 2:35): I finished the database connection pool and almost got the login mechanism finished. So tomorrow I can go on with the registration page and see how far I can get. With some luck I will be able to view the Profile page, which would be a really decent milestone. Hopefully I can continue tomorrow without any surprises :)

Saturday 23 February 2008

Back on Track

After a forced break caused by an unexpected assignment I can continue again. Having lost 4 days, I have plenty of stuff to catch up on. I am currently at a point where I have started to depend on features that are outsourced to the website. Since I have kinda lost my focus on the messenger (thanks to that assignment), I decided to work on the web page now. Unfortunately there have been two issues which have basically been driving me crazy.

Problem 1 (wasted 1 day on it)

I tried to develop the whole page the way the W3C wants it, with CSS and divs only. The main layout went easily with Dreamweaver, but getting the tabs done was one single nightmare! As I definitely don't have time to play around with that, I solved this part with a "highly forbidden" table. But I think I'm not the only one:
http://www.randyrants.com/2005/04/css_sucks_bollo.html
http://www.decloak.com/Dev/CSSTables/CSS_Tables_16.aspx

Problem 2 (Wasted another day on that one)

I know packages are a good thing and I use them extensively. But why the heck is Tomcat unable to find classes in the /WEB-INF/classes folder when they are not in a package? It was thanks to a friend, who advised me to try it with a package, that I was rescued from wasting a huge amount of time.

What is done now?

Besides wasting time on those tiny little annoyances, I have also been working successfully. For example the main layout of my website is done! And it looks pretty nice for the short time I have spent on it :) Tomcat is running, and I even managed to get Eclipse working the way I wanted in less than 30 mins (I was a bit afraid of getting stuck). Since the class stuff is running, I have also finished the localization class, which allows me to use one single text file containing all the strings of a specific language and put them into the page. The next steps are: get the database connection pool running and get the user login/registration running. After that it's gonna be filling it up with content. First the main stuff like the user profile,... oh and not to forget the avatar upload! <- Note to self! I hope that I will be able to work decently fast, since I can reuse plenty of classes developed for the instant messenger. The localization module, for example, was done by modifying the client localization class from the messenger.
Maybe I'm gonna post some screenshots soon :-P
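To give a rough idea of what a localization class like the one mentioned above could look like, here is a minimal sketch. It is not the DITO code: the key=value file format, the class name, the method names and the "??key??" fallback are all assumptions made for illustration. It uses java.util.Properties with an explicit UTF-8 reader, so that special characters in the language file survive.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical sketch, assuming one "key=value" pair per line.
final class LocalizationSketch {

    private final Properties strings = new Properties();

    // Loads all key=value pairs from the given character source.
    LocalizationSketch(Reader source) {
        try {
            strings.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens the language file with an explicit UTF-8 decoder.
    static LocalizationSketch fromFile(String fileName) throws IOException {
        try (Reader in = new InputStreamReader(
                new FileInputStream(fileName), StandardCharsets.UTF_8)) {
            return new LocalizationSketch(in);
        }
    }

    // Returns the translated string, or a visible marker if it is missing.
    String get(String key) {
        return strings.getProperty(key, "??" + key + "??");
    }

    public static void main(String[] args) {
        // An in-memory stand-in for a German language file.
        LocalizationSketch de = new LocalizationSketch(
                new StringReader("login.title=Anmelden\nlogin.user=Benutzername\n"));
        System.out.println(de.get("login.title"));
        System.out.println(de.get("not.translated.yet"));
    }
}
```

Returning a visible marker for missing keys makes strings that have not been localized yet easy to spot during testing.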

Saturday 16 February 2008

A decent amount of work done

So, I already managed a decent amount of work over the last days. Sending messages initially works, though I have to do more tests in this direction. Currently it works out very well if a contact is offline and the message has to be stored in the database. Logging in also works fine now. The next step is the development of signing in and signing out. I also fixed some bugs with the aspect ratio and the loss of transparency when resizing images. I also found some decent ways to call the web browser, which will allow me to outsource some stuff to the social network, like the registration, buddy search and stuff like that. Even the database works quite well now. The user interface also looks better and better. Currently I work on the sign in/sign out function. I hope it won't be long now until it's just a matter of adding new features. I will also have to stop working for a bit, as assignments have started again, though I don't like this fact because it takes my attention away. Here is just a small list of future steps within the messenger:
  • Allow Avatar change
  • Implement file exchange
  • Allow detailed Profile views
  • BuddySearch within the BuddyList
  • Social networking Features
  • Other stuff
So, I hope this will be finished soon. It's the middle of February already.

Monday 11 February 2008

Database giving me a headache

I lost quite some time today just trying to figure out how to insert into two tables using the same ID. The problem is that I am using some kind of inheritance for users. I am basically treating Members, Chatrooms and Groups as some kind of users, as all of them share some attributes. So I have a lookup table which records the user's ID, its type and some common attributes. As a result, it was necessary to also create an entry in the lookup table with the same user ID.

The problem now was that in Oracle SQL I found it almost impossible to efficiently insert into two tables with the same ID, especially with sequences, as I haven't found any way to read out the current or following value, nor to prevent it from being overwritten/changed between the two inserts. The only way would be to read out the row just written, but how? Here I would possibly have to add another key or something else to quickly find this row. Reading out the highest userID would also be inefficient, as I might have to run through thousands of rows and compare them. And again, how could I guarantee that another thread wouldn't do the same? The server is split into two parts which only communicate through the database.

At the end of the day I decided to solve the problem by generating random user IDs and checking whether the generated ID already exists. This allows me to ensure that the generated ID does not exist yet, and it is extremely unlikely that the generator produces the same ID for another thread/process at the same time.

The next problem occurred with Java. After programming in this language for such a long time, I found out by accident today that integers cannot be declared unsigned, which reduced my number of usable chars to less than 16 bits. So I now made a difficult decision and changed the format to chars from 0-9, a-z, A-Z. I hope that there won't be any more such surprises waiting for me.
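The random ID approach described above might be sketched like this. It is not the actual DITO code: the class and method names and the ID length of 8 are assumptions, and a plain in-memory Set stands in for the "does this ID already exist?" lookup in the database. In a real setup a primary key or unique constraint on the ID column would presumably still be needed to catch the rare case where two processes pick the same ID at the same moment.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of random IDs over the characters 0-9, a-z, A-Z.
final class RandomIdSketch {

    private static final String ALPHABET =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds an ID of the given length from randomly chosen alphabet characters.
    static String newId(int length) {
        StringBuilder id = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            id.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return id.toString();
    }

    // Keeps generating until an ID is found that is not taken yet.
    // The set is only a stand-in for the existence check in the database.
    static String newUniqueId(Set<String> taken, int length) {
        String id;
        do {
            id = newId(length);
        } while (!taken.add(id)); // add() returns false if the ID already existed
        return id;
    }

    public static void main(String[] args) {
        Set<String> existingIds = new HashSet<String>();
        System.out.println(newUniqueId(existingIds, 8));
        System.out.println(newUniqueId(existingIds, 8));
    }
}
```

With 62 possible characters per position, even a short ID gives a very large number of combinations, which is what makes a collision between two threads so unlikely.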