Time-based UUID has incorrect format

description

I've discovered that the TimeGenerator.GetTimeUUID() method does not produce proper Type 1 UUIDs. The format is not compatible with the Cassandra TimeUUIDType comparator and produces a "Bad Request: TimeUUID supports only version 1 UUIDs" error when used as part of a CQL statement. In addition, the UUIDs cannot be decoded by the unix uuid command, which produces the following output:
uuid -d -F str ce940a70-f0a8-e111-a032-a102c454e529
encode: STR: ce940a70-f0a8-e111-a032-a102c454e529
    SIV:     274589638838886627173931652770604311849
decode: variant: DCE 1.1, ISO/IEC 11578:1996
    version: 14 (unknown)
    content: CE:94:0A:70:F0:A8:01:11:20:32:A1:02:C4:54:E5:29
             (not decipherable: unknown UUID version)
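
The "version: 14" reported above comes from the first hex digit of the third group (the "e" in "e111"), which is where the canonical text form of a UUID carries its version. A minimal sketch of that check in C#; the class and method names are hypothetical and not part of Cassandraemon:

```csharp
using System;

public static class UuidVersionCheck
{
    // Returns the version nibble of a UUID given in canonical 8-4-4-4-12 text form.
    public static int VersionOf(string uuid)
    {
        if (uuid == null || uuid.Length != 36)
        {
            throw new ArgumentException("expected a 36-character UUID string", "uuid");
        }
        // Character 14 is the first hex digit of the third group.
        return Convert.ToInt32(uuid.Substring(14, 1), 16);
    }
}
```

For the string in this report the method returns 14, while a time-based UUID would return 1.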

comments

shawn_ wrote May 28, 2012 at 10:27 PM

I've found that FluentCassandra does produce compatible Type 1 UUIDs. I created one using the GuidGenerator.GenerateTimeBasedGuid() method, ran it through uuid, and got the correct output:
uuid -d -F str a3d7f91a-a90c-11e1-a543-080027335609
encode: STR: a3d7f91a-a90c-11e1-a543-080027335609
    SIV:     217785559569767368072561636003842053641
decode: variant: DCE 1.1, ISO/IEC 11578:1996
    version: 1 (time and node based)
    content: time:  2012-05-28 21:32:35.136745.0 UTC
             clock: 9539 (usually random)
             node:  08:00:27:33:56:09 (global unicast)
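
The "time" line in this output can be reproduced by hand. A version 1 UUID stores a 60-bit count of 100-nanosecond intervals since 1582-10-15 UTC, split across the first three groups of the text form. The following is a sketch only, working from the canonical string; the class and method names are hypothetical:

```csharp
using System;

public static class TimeUuidDecoder
{
    // Start of the Gregorian calendar, the epoch used by version 1 UUID timestamps.
    private static readonly DateTime GregorianEpoch =
        new DateTime(1582, 10, 15, 0, 0, 0, DateTimeKind.Utc);

    // Extracts the embedded timestamp from a canonical version 1 UUID string.
    public static DateTime TimestampOf(string uuid)
    {
        string[] parts = uuid.Split('-');
        long timeLow = Convert.ToInt64(parts[0], 16);          // low 32 bits
        long timeMid = Convert.ToInt64(parts[1], 16);          // middle 16 bits
        long timeHi = Convert.ToInt64(parts[2], 16) & 0x0FFF;  // high 12 bits, version nibble dropped
        long ticks = (timeHi << 48) | (timeMid << 32) | timeLow;
        // One .NET tick is 100 ns, the same unit the UUID timestamp uses.
        return GregorianEpoch.AddTicks(ticks);
    }
}
```

Calling TimestampOf("a3d7f91a-a90c-11e1-a543-080027335609") should give a time on 2012-05-28 UTC, in line with the uuid output above.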

I believe the problem is that the .NET Guid type stores the timestamp differently than Java does. Nick Berardi on the FluentCassandra project had the same issue a while ago and documented it here:
http://coderjournal.com/2010/04/creating-a-time-uuid-guid-in-net/

sabro wrote May 29, 2012 at 8:03 PM

Please try inspecting the Cassandra data with cassandra-cli.

// create the column family
create column family Uuid with comparator = 'TimeUUIDType';

// insert from the .NET app
...

// list the data and decode it with the unix uuid command
list Uuid;

The UUID from Cassandraemon is probably correct, and the UUID from FluentCassandra causes an error when the data is registered.

I think the reason is either that Java uses big-endian or that Cassandra decodes the UUID incorrectly.

kojiishi wrote May 30, 2012 at 1:54 AM

If I understand correctly, cassandra-cli changed its TimeUUIDType at some version. Is this related?

shawn_ wrote May 30, 2012 at 7:28 PM

OK, I've figured out exactly what the problem is now. The problem only occurs when using CQL; the normal API works fine. The reason I'm getting the errors is that the .ToString() method on System.Guid produces an incompatible string when used to build my CQL statements. It seems like the first 4 bytes (and the next 2) are jumbled up (probably an endian difference between Java and .NET). I created a method to convert a Guid to a Cassandra-compatible string and the problem was fixed. Here is the method:

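using System;
using System.Text;

// Formats a Guid as a Cassandra-compatible UUID string (8-4-4-4-12 hex).
public static string ToCQLString(Guid g)
{
    // ToCassandraByte() is the helper used in the original post;
    // it is expected to return the 16 bytes of the UUID.
    byte[] bytes = g.ToCassandraByte();
    var sb = new StringBuilder(36); // 32 hex digits + 4 hyphens
    for (int i = 0; i < bytes.Length; i++)
    {
        // Hyphens go before bytes 4, 6, 8 and 10.
        if (i == 4 || i == 6 || i == 8 || i == 10)
        {
            sb.Append('-');
        }
        sb.AppendFormat("{0:x2}", bytes[i]);
    }
    return sb.ToString();
}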
Here is an example:
Guid.ToString()
ce98f5a2-8baa-e111-b561-eb658ed87338
ToCQLString(g);
a2f598ce-aa8b-11e1-b561-eb658ed87338
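
The two strings in this example differ only in the first three groups, each of which is byte-reversed. That matches how .NET treats a Guid: Guid.ToString() prints the first three fields as integers, while Guid.ToByteArray() returns those same fields in little-endian byte order. A sketch that reproduces the example without the ToCassandraByte() helper, assuming the Guid was created by passing raw network-order UUID bytes to new Guid(byte[]) (an assumption, not something confirmed in this thread); the names are hypothetical:

```csharp
using System;

public static class GuidByteOrderDemo
{
    // Formats the Guid's raw bytes, in storage order, as 8-4-4-4-12 hex.
    public static string RawBytesToUuidString(Guid g)
    {
        byte[] b = g.ToByteArray();
        // BitConverter.ToString yields "A2-F5-98-..."; strip separators and lowercase.
        string hex = BitConverter.ToString(b).Replace("-", "").ToLowerInvariant();
        return hex.Substring(0, 8) + "-" + hex.Substring(8, 4) + "-" +
               hex.Substring(12, 4) + "-" + hex.Substring(16, 4) + "-" +
               hex.Substring(20, 12);
    }
}
```

With Guid.Parse("ce98f5a2-8baa-e111-b561-eb658ed87338") as input, this returns a2f598ce-aa8b-11e1-b561-eb658ed87338, the same string ToCQLString produced above.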

I'm also uploading a program with my tests (normal API and CQL) and the working example of how the ToCQLString is used.

Sabro, please consider adding this as an extension method or in a utility class in Cassandraemon.

sabro wrote May 30, 2012 at 10:40 PM

Great!

I promise to add this method in the next version.

shawn_ wrote May 31, 2012 at 12:03 AM

I was thinking a better fix might be to change how the time-based UUID is stored internally in the Guid class, so that it's stored in the correct format instead of needing to be converted every time. This is the way FluentCassandra handles it. You can see their implementation here:
https://github.com/managedfusion/fluentcassandra/blob/master/src/GuidGenerator.cs
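
One way to store the value in the correct format can be sketched as follows, assuming the UUID is available as 16 bytes in network (big-endian) order: reverse the first three fields before handing the bytes to the Guid constructor, so that Guid.ToString() prints the canonical form directly. This is an illustration only, not FluentCassandra's actual code, and the names are hypothetical:

```csharp
using System;

public static class NetworkOrderGuid
{
    // Builds a Guid whose ToString() matches the canonical UUID text form.
    public static Guid FromNetworkOrder(byte[] uuidBytes)
    {
        if (uuidBytes == null || uuidBytes.Length != 16)
        {
            throw new ArgumentException("expected 16 bytes", "uuidBytes");
        }
        return new Guid(SwapFields((byte[])uuidBytes.Clone()));
    }

    // Returns the Guid's bytes in network order, ready to send to Cassandra.
    public static byte[] ToNetworkOrder(Guid g)
    {
        return SwapFields(g.ToByteArray());
    }

    // Reverses the three leading fields (4, 2 and 2 bytes) in place.
    private static byte[] SwapFields(byte[] b)
    {
        Array.Reverse(b, 0, 4);
        Array.Reverse(b, 4, 2);
        Array.Reverse(b, 6, 2);
        return b;
    }
}
```

With this approach the conversion happens once at the boundary, and a plain Guid.ToString() can be used when building CQL statements.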

kojiishi wrote May 31, 2012 at 8:13 AM

I have to admit I don't fully understand this issue, but I thought this information might be helpful, so I'm sharing it here.

http://grokbase.com/t/cassandra/commits/119ja9vxqw/jira-created-cassandra-3227-cassandra-cli-use-micro-second-timestamp-but-cql-use-milli-second

cassandra-cli sets a microsecond timestamp via FBUtilities.timestampMicros, but CQL insert and update operations set a millisecond timestamp via AbstractModification.getTimestamp.

If you register data with cassandra-cli, you can't update it with CQL, because the CQL timestamp is judged to be in the past.
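
To illustrate why a millisecond timestamp is judged to be in the past: for the same instant, a microsecond count is about 1000 times larger than a millisecond count, so a later write stamped in milliseconds still compares as older. A small sketch with hypothetical helper names:

```csharp
using System;

public static class TimestampResolutionDemo
{
    private static readonly DateTime UnixEpoch =
        new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    // Milliseconds since the Unix epoch.
    public static long ToMillis(DateTime utc)
    {
        return (utc - UnixEpoch).Ticks / TimeSpan.TicksPerMillisecond;
    }

    // Microseconds since the Unix epoch (one tick is 100 ns, so 10 ticks per microsecond).
    public static long ToMicros(DateTime utc)
    {
        return (utc - UnixEpoch).Ticks / 10;
    }
}
```

For example, ToMillis of a time one hour after a first write is still smaller than ToMicros of the first write itself.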

kojiishi wrote May 31, 2012 at 8:14 AM

...or maybe it's not very related? The issue seems to be about the format, not the resolution. Sorry if this is noise.

sabro wrote Jun 1, 2012 at 4:18 AM

shawn_
I will consider it.
kojiishi
It might not be related. It is fixed in version 1.0.