Java Master Ep.2 – Strings Manipulation Principles | Coding Interview


Strings are at the core of any Java application and are among its most used classes. We will take a closer look at some of the principles and rules that govern them. We’ll start with their nature and how Java actually manages these objects internally. The most desirable ways of building larger strings come next, followed by the secrets of password storage and management, so that we keep our sensitive data safe and sound. Almost every application has more than one language for the user interface, so we will go in depth on how to manage those multilingual scenarios. After that, we’ll take advantage of existing libraries for the most common string manipulation operations. Finally, we’ll dive into various performance statistics that will help you choose the right path when faced with various string processing situations.
If you find the content to be of value, then click the subscribe button to get notified about any future content. Also, if you want to expand your knowledge of software craftsmanship techniques and automated testing, go ahead and check out my website. All the timestamps for each of the chapters are in the description box, if you want to head straight to the one that interests you. Now, without any further ado, roll it.
Strings are quite a specimen in the Java ecosystem. They are one of the few kinds of object that have their own unique nature and behavior. It is paramount to have this knowledge instilled in our minds before we get into any serious coding. We’ll start by trying to uncover the basic nature of strings in Java, all with the help of our hero, who will be trying to upgrade his armory. First, the hero writes down all the equipment upgrades that he would like. It has been a while since he last did that, and his current gear is starting to get a little rusty.
Once he arrives at the shop, he gives the list to the merchant, who tries to find the pieces in his current stock. If the name matches, the values are added up. Finally, a receipt is produced based on the total worth of the gear. Some of you may have already noticed that this method may not always work as intended and is not optimized. Before we identify the faults, let’s uncover how strings are managed on the Java heap.
When we create a string using a literal for the first time, an object is created on the heap. If we use the same literal anywhere else in our application, no new object is created; the existing one is used instead. That is why comparing these two variables with the equality operator (==) gives a positive result. Moving forward, we create the same string again, but this time with the constructor. Unlike with literals, a new object is created on the heap every single time, regardless of whether an object representing that string already exists. This means there is a danger of using tons of memory if some strings are used very often in our application. Also, if we compare the constructor-created string with the literal-created string using ==, we get false this time.
OK, with all that in mind, we can quickly identify the problematic parts of our method. Now we know that the == comparison might simply not work all the time: even if the actual strings are the same, the objects may not be. That is why we should always use the equals method to compare string objects. In addition, we should start the comparison from the string that is a constant, not from a variable. We have an enum entry here, so starting with that is a lot safer, as we might get a NullPointerException if we were to start with a variable that happens to be null. Finally, we will replace the amount string created with the constructor with a literal.
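The on-screen code is not part of this transcript, so here is only a minimal sketch of the behavior just described. The `Gear` enum and the `matches` helper are hypothetical stand-ins for the merchant's stock check, not the original method.

```java
class StringPoolDemo {
    // Hypothetical enum standing in for the merchant's stock entries.
    enum Gear { SWORD, SHIELD }

    // Starts the comparison from the constant, so a null name cannot throw.
    static boolean matches(Gear gear, String requestedName) {
        return gear.name().equals(requestedName);
    }

    public static void main(String[] args) {
        String literalA = "SWORD";
        String literalB = "SWORD";                 // reuses the pooled object
        String constructed = new String("SWORD");  // always a new heap object

        System.out.println(literalA == literalB);          // true: same pooled object
        System.out.println(literalA == constructed);       // false: different objects
        System.out.println(literalA.equals(constructed));  // true: same characters
        System.out.println(matches(Gear.SWORD, null));     // false, and no NullPointerException
    }
}
```

Reversing the call to `requestedName.equals(gear.name())` would throw as soon as the list contained a null entry, which is why the constant goes first.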
Thanks to the literal, every unique amount will be kept in memory only once and will never be duplicated. Duplication might become a problem if our game is successful and is played by millions of players daily.
Probably the most common string operation with performance implications is concatenation. Sometimes you just combine a couple of strings, which is a very innocent operation. Sometimes, though, it is hundreds or even thousands of strings that need to be concatenated. You should be very cautious in these situations and pick just the right strategy. In this example, the citizens of a nearby kingdom have gathered to thank the hero for slaying the dragon that has been haunting nearby villages for years. Apart from receiving each gift and storing it in the treasure chest, he would like to keep a log of all the gifts he has received. I guess just in case his accountant complains about the hero’s tax return at the end of the year. So the way he does it is by concatenating the strings using the plus operator. This will work, of course. We might be missing out on performance though, especially if the hero receives thousands of gifts in our online game and, in total, there are thousands of heroes going through the same process.
Let us look in detail at what is going on here and what alternatives we have. When we concatenate strings in a loop, what happens under the hood is that a new string is created every time a concatenation happens. This is because strings are immutable and we cannot modify them. With a large loop, the number of objects created is most likely not acceptable. This is where StringBuilder shines and comes to the rescue. The way this structure works is that when we append a string, we do not create a new string object on the heap each time. We keep only one StringBuilder object and one reference to it. That’s it.
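As a rough sketch of the two approaches (the hero's actual gift-logging code is not in the transcript, so the class and method names here are made up for illustration), the plus-operator loop and the StringBuilder loop could look like this:

```java
import java.util.Arrays;
import java.util.List;

class GiftLog {
    // Plus-operator version: every pass through the loop builds a brand-new String.
    static String logWithPlus(List<String> gifts) {
        String log = "";
        for (String gift : gifts) {
            log = log + gift + "\n";
        }
        return log;
    }

    // StringBuilder version: one builder is reused for every append,
    // and a String is created only once, at the end.
    static String logWithBuilder(List<String> gifts) {
        StringBuilder log = new StringBuilder();
        for (String gift : gifts) {
            log.append(gift).append('\n');
        }
        return log.toString();
    }

    public static void main(String[] args) {
        List<String> gifts = Arrays.asList("golden goblet", "silk cloak", "jar of honey");
        System.out.print(logWithBuilder(gifts));
    }
}
```

Both methods return the same text; the difference is only in how many intermediate objects are created along the way.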
An important thing to keep in mind is that most current Java compilers implicitly replace plus-operator concatenation with a StringBuilder alternative. This only helps for simple, single-expression concatenations, though, not for ones that happen in a loop, where a new string is still produced on every iteration. With this crucial knowledge in mind, we can easily go back to our hero receiving his gifts and make him use a StringBuilder instead of straight concatenation, to take a bit of the memory burden off the application.
Finally, let us compare the ways we can concatenate strings. First, we have simple string concatenation, which we should only use for mostly fixed content that is small and does not change very often. Then we have the good old StringBuilder, which we should use in most other scenarios, with the exception of cases where we need to make sure our concatenations are thread safe. This is where another structure comes into play: StringBuffer. This one has exactly the same features as StringBuilder, with additional thread safety implemented under the hood. For this reason its performance is lower, and it should be used with caution.
Every application has its fair share of sensitive data processing and storage. If we use string objects to transport this type of data, we might not protect it from the most skilled hackers. This authentication method looks simple and straightforward. We first encrypt the password that is submitted by the user on the HTML form. Then we try to find the combination of login and encrypted password in our database. It is straightforward, but there is something dodgy about it. If you carefully followed all the examples from the start, you should notice a small detail that might be a threat to our application. When the password entered by the user is passed into the method, a new string object is created on the heap. Then we encrypt the password before passing it on for verification.
Java creates another string object on the heap. Now, after the method finishes, these objects will remain on the heap until they are garbage collected, and we do not have any control over when that happens. Maybe after a second, maybe after an hour. During this time they are there. This gives a window of opportunity for someone to hack into the virtual machine and steal these passwords that are just floating around. There is a well-known solution to this pickle: using char arrays. This solves our problem because, unlike an immutable string, a char array can be overwritten as soon as we are done with it. The array itself still lives on the heap, but once it has been wiped there is no trace of the password left.
If you want to go global with your kickass program, you will have to use internationalization at some point; that’s inevitable. Just be smart about it, without duplicating tons of code just to satisfy yet another language. Our hero has had quite a successful streak lately. He has been winning against other knights and slaying all the monsters in his path. Various cities have started to invite him as a motivational speaker to build up the morale of the common folk. Before he starts his speech, he asks about the language of the region he’s currently in. Then he needs to go through all of his notes to find the ones that fit the language he needs to speak in. The crowd is getting anxious and our hero just keeps dropping his notes everywhere. It is just a mess. Just like this code over here.
We should help our hero a bit, so that he does not need to carry tens of different versions of his speech, just one that he can memorize. Well, this looks more like it. All we need to do right now is make sure we get the proper language details. Then we tell our master translator which language he should translate our hero’s speech into, and get into it. Code-wise, we tell Java which property file containing the language-specific texts should be used.
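The refactored code from the video is not in the transcript, so the following is only a sketch of the idea using the standard `ResourceBundle` classes. To keep it self-contained, the two property files (hypothetically `speech_en.properties` and `speech_fr.properties`) are simulated with in-memory strings, and the `greeting` key and its texts are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class SpeechTranslator {
    // In-memory stand-ins for the per-language property files (hypothetical content).
    private static final Map<String, String> FILES = new HashMap<>();
    static {
        FILES.put("en", "greeting=Good people, fear no more!\n");
        FILES.put("fr", "greeting=Bonjour, braves gens!\n");
    }

    // Picks the "file" for the locale's language, falling back to English.
    static ResourceBundle bundleFor(Locale locale) {
        String content = FILES.getOrDefault(locale.getLanguage(), FILES.get("en"));
        try {
            return new PropertyResourceBundle(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The hero always asks for the same key; the bundle supplies the right text.
    static String greet(Locale locale) {
        return bundleFor(locale).getString("greeting");
    }

    public static void main(String[] args) {
        System.out.println(greet(Locale.ENGLISH));
        System.out.println(greet(Locale.FRENCH));
    }
}
```

In a real application the files would normally sit on the classpath and be loaded with `ResourceBundle.getBundle("speech", locale)`, which performs the locale lookup and fallback for you.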
Our hero can always speak the same way, and our translator will take care of the right message for us. Now our hero can do what he is best at and forget all that lousy paperwork.
Keep in mind that 90 percent of the string operations present in your application have already been done thousands of times before by other developers. On top of that, these operations have been perfected and optimized. Why not be smart now and then and take advantage of that wealth of code? This time, we will try to load a previously saved game. In the beginning we do the usual loading of the environment state, the hero’s state and the mission state. Then we also want to load the personal journal that the player has been filling out since he started. This journal, unlike the rest, is loaded from a separate text file stored on the system. In the usual way, we would need to do something like this, at least, to get all the content we want. This just sucks. We have a nice way of loading all the states and then we have this brick in the middle. It just looks bad. Let us save ourselves a bit of headache and take advantage of one of the utility libraries available for Java. Now our code looks a lot cleaner, and we can make it even better by moving the entire try-catch block into a private method. Then our method would read literally like a chapter of a book, and that is what we always want to strive for.
Most of the low- to mid-complexity processing that you need to perform on strings in your application has been done thousands of times by other people before. Thus, most of the standard libraries have that covered for you. You just have to use them. I would personally start with the Apache Commons libraries, then go over to Spring or Guava as alternatives if Apache does not support what you need. But of course you are free to use any library you wish, as long as it works for you and your team.
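The transcript does not show which library call was used on screen, so this sketch substitutes the JDK's own `java.nio.file.Files` to stay dependency-free; with Apache Commons IO the helper body would typically shrink to a call such as `FileUtils.readFileToString(file, StandardCharsets.UTF_8)`. The class and method names below are hypothetical. The point is the shape: the try-catch lives in one small helper, so the loading method reads cleanly.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SavedGameLoader {
    // Small helper that hides the try-catch from the calling code.
    static String loadJournal(Path journalFile) {
        try {
            return new String(Files.readAllBytes(journalFile), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read journal " + journalFile, e);
        }
    }

    // Writes a throwaway journal file, reads it back, then cleans up.
    static String demo() {
        try {
            Path file = Files.createTempFile("journal", ".txt");
            try {
                Files.write(file, "Day 1: slew the dragon.".getBytes(StandardCharsets.UTF_8));
                return loadJournal(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```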
As the cherry on the cake, we’ll look at various performance measurements comparing the most common operations on strings, to finish off today’s topic. Keep these in mind, especially if you’re planning to go for an interview to get that lucrative Java contract you have been hunting for. They might ask you about the conclusions you are about to see.
In the first round we will see the constructor versus literal string creation face-off. We are going to use the standard JDK benchmark tests with a sample of 10,000 runs. The results are as expected: string literal creation beats the string constructor quite considerably. So always prefer literal creation over a constructor, unless you know what you’re doing and it is deliberate.
In the second round, we’ll have a clash of the various concatenation strategies available to us. We’ll have the plus operator, the concat method and, finally, the good old format method. The results are all over the place, with the format method being the absolute loser and the plus operator being the absolute winner, most likely because most compilers replace plus-operator concatenation with a StringBuilder implementation. Let’s keep these numbers in mind next time we need to concatenate simple strings.
In the final round, we’ll bring StringBuilder and its older brother StringBuffer into the ring. We’ll do some simple appending a couple of thousand times and verify the results. And they are, as expected, in favor of StringBuilder, not by some crazy extent, but enough to make it the winner and the go-to implementation in most cases, except where thread safety is required.
After today’s episode, you should know how string objects behave on the Java heap and how to use that to your advantage. You should also be able to differentiate between StringBuilder and StringBuffer and know when to use each of them. Internationalization and password processing should also be no mystery to you.
I hope you enjoyed today’s episode and I hope it has added a few ideas to your software craftsmanship toolkit. And remember, programming is supposed to be a creative, fun and fulfilling experience. To all you software artists out there: I’m looking forward to seeing you again in our next episode.
