Wackiness With String, GString, and Equality in Groovy - No Fluff Just Stuff

Wackiness With String, GString, and Equality in Groovy

Posted by: Scott Leberknight on March 20, 2008

We ran into this one today while writing a Groovy unit test. Check out the following GroovyShell session:

groovy:000> test = []
===> []
groovy:000> (1..10).each { test << "Item $it" } ===> 1..10
groovy:000> test                               
===> [Item 1, Item 2, Item 3, Item 4, Item 5, Item 6, Item 7, Item 8, Item 9, Item 10]
groovy:000> test.contains("Item 1")
===> false

Um...what? We started by adding items to a list, e.g. "Item 1", "Item 2", and so on. Then we simply ask whether the list contains an item we know (or think) is there. But no, it prints false! How can test possibly not contain "Item 1?" First, what's going on here? Look at this second example:

groovy:000> test = []
===> []
groovy:000> (1..10).each { test << "Item $it" } ===> 1..10
groovy:000> test
===> [Item 1, Item 2, Item 3, Item 4, Item 5, Item 6, Item 7, Item 8, Item 9, Item 10]
groovy:000> println test[0].class 
class org.codehaus.groovy.runtime.GStringImpl
===> null

Ah ha! The items in the list aren't strings. So what? Look at this code:

groovy:000> a = "Item 2"
===> Item 2
groovy:000> a.class
===> class java.lang.String
groovy:000> n = 2
===> 2
groovy:000> b = "Item $n"
===> Item 2
groovy:000> b.class
===> class org.codehaus.groovy.runtime.GStringImpl
groovy:000> a == b
===> true
groovy:000> b == a
===> true
groovy:000> a.equals(b)
===> false
groovy:000> b.equals(a)
===> false

a is a normal Java String, whereas b is a Groovy GString. Even more interesting (or perhaps confusing), a == comparison yields true, but a comparison using equals() is false! So back in the original example where we checked if the 'test' list contained 'Item 1', the contains() method (which comes from Java not Groovy) uses equals() to compare the values and responds that in fact, no, the list does not contain the String "Item 1." Without getting too much more detailed, how to fix this? Look at this example:

groovy:000> test = []
===> []
groovy:000> (1..10).each { test << "Item $it".toString() } ===> 1..10
groovy:000> test
===> [Item 1, Item 2, Item 3, Item 4, Item 5, Item 6, Item 7, Item 8, Item 9, Item 10]
groovy:000> test.contains('Item 1')
===> true

Now it returns true, since in the above example we explicitly forced evaluation of the GStrings in the list by calling toString(). The Groovy web site explains about GString lazy evaluation stating that "GString uses lazy evaluation so its not until the toString() method is invoked that the GString is evaluated."

Ok, so we now understand why the observed behavior happens. The second question is whether or not this is something likely to confuse the You-Know-What out of developers and whether it should behave differently. To me, this could easily lead to some subtle and difficult to find bugs and I think it should work like a developer expects without requiring an understanding of whether GStrings are lazily-evaluated or not. Lazy evaluation always seems to come at a price, for example in ORM frameworks it can easily result in the "n + 1 selects" problem.

For comparison, let's take a look at similar Ruby example in an irb session:

>> test = []
=> []
>> (1..10).each { |n| test << "Item #{n}" } => 1..10
>> test
=> ["Item 1", "Item 2", "Item 3", "Item 4", "Item 5", "Item 6", "Item 7", "Item 8", "Item 9", "Item 10"]
>> test.include? "Item 1"
=> true

Ruby "does the right thing" from just looking at the code. Ruby 1, Groovy 0 on this one. I don't actually know whether Ruby strings evaluate the expressions immediately or not, and should not need to care. Code that looks simple should not require understanding the implementation strategy of a particular feature. Otherwise it is broken in my opinion.

Scott Leberknight

About Scott Leberknight

Scott is Chief Architect at Near Infinity Corporation, an enterprise software development and consulting services company based in Reston, Virginia. He has been developing enterprise and web applications for 14 years professionally, and has developed applications using Java, Ruby, Groovy, and even an iPhone application with Objective-C. His main areas of interest include alternative persistence technologies, object-oriented design, system architecture, testing, and frameworks like Spring, Hibernate, and Ruby on Rails. In addition, Scott enjoys learning new languages to make himself a better and more well-rounded developer a la The Pragmatic Programmers' advice to “learn one language per year.”

Scott holds a B.S. in Engineering Science and Mechanics from Virginia Tech, and an M. Eng. in Systems Engineering from the University of Maryland. Scott speaks at the No Fluff Just Stuff Symposiums and various other conferences. In his (sparse) spare time, Scott enjoys spending time with his wife, three children, and cat. He also tries to find time to play soccer, go snowboarding, and mountain bike whenever he can.

Why Attend the NFJS Tour?

  • » Cutting-Edge Technologies
  • » Agile Practices
  • » Peer Exchange

Current Topics:

  • Languages on the JVM: Scala, Groovy, Clojure
  • Enterprise Java
  • Core Java, Java 8
  • Agility
  • Testing: Geb, Spock, Easyb
  • REST
  • NoSQL: MongoDB, Cassandra
  • Hadoop
  • Spring 4
  • Cloud
  • Automation Tools: Gradle, Git, Jenkins, Sonar
  • HTML5, CSS3, AngularJS, jQuery, Usability
  • Mobile Apps - iPhone and Android
  • More...
Learn More »