{"id":13224,"date":"2017-07-24T20:53:52","date_gmt":"2017-07-24T20:53:52","guid":{"rendered":"http:\/\/www.doanduyhai.com\/blog\/?p=13224"},"modified":"2017-08-02T15:27:05","modified_gmt":"2017-08-02T15:27:05","slug":"gremlin-recipes-1-gremlin-as-a-stream","status":"publish","type":"post","link":"https:\/\/www.doanduyhai.com\/blog\/?p=13224","title":{"rendered":"Gremlin recipes: 1 &#8211; Gremlin as a Stream"},"content":{"rendered":"<p>This blog post belongs to a series called <strong>Gremlin Recipes<\/strong>. The purpose is to explain the internals of <strong><a href=\"http:\/\/tinkerpop.apache.org\/docs\/3.2.5\/reference\/#traversal\" target=\"_blank\">Gremlin<\/a><\/strong> and give people a deeper insight into the query language so they can master it.<\/p>\n<p><!--more--><\/p>\n<h1>I KillrVideo dataset<\/h1>\n<p>To follow this series of recipes, you first need to create the schema for <strong>KillrVideo<\/strong> and import the data.<\/p>\n<p>The graph schema of this dataset is:<\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/s3.amazonaws.com\/datastax-graph-schema-viewer\/index.html#\/?schema=killr_video_small.json\" height=\"600px\" width=\"100%\"><\/iframe><\/p>\n<p><span id=\"killrvideo_dataset\"><strong>INSERTING DATA<\/strong><\/span><br \/>\nFirst, open the <strong><a href=\"http:\/\/tinkerpop.apache.org\/docs\/3.1.1-incubating\/tutorials\/the-gremlin-console\/\" target=\"_blank\">Gremlin console<\/a><\/strong> or <strong><a href=\"https:\/\/www.datastax.com\/products\/datastax-studio-and-development-tools#DataStax-Studio\" target=\"_blank\">DataStax Studio<\/a><\/strong> (either works fine) and execute the following statements:<\/p>\n<div class=\"accordion \"  ><br \/>\n  <div class=\"accordion-group\">\r\n        <div class=\"accordion-heading\">\r\n        <a class=\"accordion-toggle\" data-toggle=\"collapse\" \r\n        data-parent=\"#accordion2\" href=\"#accordian_item_986\">Open-source Gremlin Console config<\/a>\r\n        <\/div>\r\n         
<div id=\"accordian_item_986\" class=\"accordion-body collapse in\">\r\n            <div class=\"accordion-inner\"><\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\n:remote connect tinkerpop.server conf\/remote.yaml session-managed\r\n:remote config timeout max\r\n:remote console\r\nsystem.graph('KillrVideo').create()\r\n:remote config alias g KillrVideo.g\r\n<\/pre>\n<p><\/div>\r\n         <\/div>\r\n        <\/div><\/div>\n<div class=\"accordion \"  ><br \/>\n  <div class=\"accordion-group\">\r\n        <div class=\"accordion-heading\">\r\n        <a class=\"accordion-toggle\" data-toggle=\"collapse\" \r\n        data-parent=\"#accordion2\" href=\"#accordian_item_418\">KillrVideo schema & data loading<\/a>\r\n        <\/div>\r\n         <div id=\"accordian_item_418\" class=\"accordion-body collapse in\">\r\n            <div class=\"accordion-inner\"><\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\nschema.clear();\r\n\/\/ Property keys \r\nschema.propertyKey(&quot;genreId&quot;).Text().create(); \r\nschema.propertyKey(&quot;personId&quot;).Text().create(); \r\nschema.propertyKey(&quot;userId&quot;).Text().create(); \r\nschema.propertyKey(&quot;movieId&quot;).Text().create(); \r\nschema.propertyKey(&quot;name&quot;).Text().create(); \r\nschema.propertyKey(&quot;age&quot;).Int().create(); \r\nschema.propertyKey(&quot;gender&quot;).Text().create(); \r\nschema.propertyKey(&quot;title&quot;).Text().create(); \r\nschema.propertyKey(&quot;year&quot;).Int().create(); \r\nschema.propertyKey(&quot;duration&quot;).Int().create(); \r\nschema.propertyKey(&quot;country&quot;).Text().create(); \r\nschema.propertyKey(&quot;production&quot;).Text().multiple().create(); \r\nschema.propertyKey(&quot;rating&quot;).Int().create();\r\n\r\n\/\/ Vertex 
labels\r\nschema.vertexLabel(&quot;genre&quot;).properties(&quot;genreId&quot;,&quot;name&quot;).create();\r\nschema.vertexLabel(&quot;person&quot;).properties(&quot;personId&quot;,&quot;name&quot;).create();\r\nschema.vertexLabel(&quot;user&quot;).properties(&quot;userId&quot;,&quot;age&quot;,&quot;gender&quot;).create();\r\nschema.vertexLabel(&quot;movie&quot;).properties(&quot;movieId&quot;,&quot;title&quot;,&quot;year&quot;,&quot;duration&quot;,&quot;country&quot;,&quot;production&quot;).create();\r\n\r\n\/\/ Edge labels\r\nschema.edgeLabel(&quot;knows&quot;).connection(&quot;user&quot;,&quot;user&quot;).create();\r\nschema.edgeLabel(&quot;rated&quot;).single().properties(&quot;rating&quot;).connection(&quot;user&quot;,&quot;movie&quot;).create();\r\nschema.edgeLabel(&quot;belongsTo&quot;).single().connection(&quot;movie&quot;,&quot;genre&quot;).create();\r\nschema.edgeLabel(&quot;actor&quot;).connection(&quot;movie&quot;,&quot;person&quot;).create();\r\nschema.edgeLabel(&quot;director&quot;).single().connection(&quot;movie&quot;,&quot;person&quot;).create();\r\n\r\n\/\/ Vertex 
indexes\r\nschema.vertexLabel(&quot;genre&quot;).index(&quot;genresById&quot;).materialized().by(&quot;genreId&quot;).add();\r\nschema.vertexLabel(&quot;genre&quot;).index(&quot;genresByName&quot;).materialized().by(&quot;name&quot;).add();\r\nschema.vertexLabel(&quot;person&quot;).index(&quot;personsById&quot;).materialized().by(&quot;personId&quot;).add();\r\nschema.vertexLabel(&quot;person&quot;).index(&quot;personsByName&quot;).materialized().by(&quot;name&quot;).add();\r\nschema.vertexLabel(&quot;user&quot;).index(&quot;usersById&quot;).materialized().by(&quot;userId&quot;).add();\r\nschema.vertexLabel(&quot;user&quot;).index(&quot;search&quot;).search().by(&quot;age&quot;).by(&quot;gender&quot;).asString().add();\r\nschema.vertexLabel(&quot;movie&quot;).index(&quot;moviesById&quot;).materialized().by(&quot;movieId&quot;).add();\r\nschema.vertexLabel(&quot;movie&quot;).index(&quot;moviesByTitle&quot;).materialized().by(&quot;title&quot;).add();\r\nschema.vertexLabel(&quot;movie&quot;).index(&quot;search&quot;).search().by(&quot;year&quot;).by(&quot;country&quot;).asString().add();\r\n\r\n\/\/ Edge indexes\r\nschema.vertexLabel(&quot;user&quot;).index(&quot;toMoviesByRating&quot;).outE(&quot;rated&quot;).by(&quot;rating&quot;).add();\r\nschema.vertexLabel(&quot;movie&quot;).index(&quot;toUsersByRating&quot;).inE(&quot;rated&quot;).by(&quot;rating&quot;).add();\r\n\r\nschema.config().option(&quot;tx_autostart&quot;).set(true);\r\n\r\n\/\/ Load data from file KillrVideo-small.kryo\r\ngraph.io(IoCore.gryo()).readGraph(&quot;\/path\/to\/KillrVideo-small.kryo&quot;);\r\n<\/pre>\n<p>  <\/div>\r\n         <\/div>\r\n        <\/div><br \/>\n<\/div>\n<p>To be able to perform full scans on this small dataset, you also need to enable the following option:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\nschema.config().option('graph.allow_scan').set('true');\r\n<\/pre>\n<p>The file <strong>KillrVideo-small.kryo<\/strong> can be downloaded <a 
href=\"https:\/\/drive.google.com\/file\/d\/0B3qV2Nx-GibgZU5TeXZYaThFbXM\/view?usp=sharing\" target=\"_blank\"><strong>here<\/strong><\/a><\/p>\n<h1>II Gremlin as a Stream of data<\/h1>\n<p>Usually, any graph traversal starts with <code>g.V()<\/code>, but what is the type of this expression? To find out, just execute the query below:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt; g.V().getClass()\r\n==&gt;class org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal  \r\n<\/pre>\n<p>The type of <code><strong>g.V()<\/strong><\/code> is indeed a <code><strong>DefaultGraphTraversal<\/strong><\/code>. According to the <a href=\"http:\/\/tinkerpop.apache.org\/javadocs\/3.2.4\/full\/org\/apache\/tinkerpop\/gremlin\/process\/traversal\/dsl\/graph\/DefaultGraphTraversal.html\" target=\"_blank\"><strong>Gremlin API<\/strong><\/a>, <code><strong>DefaultGraphTraversal<\/strong><\/code> implements <code><strong>Iterator&lt;E&gt;<\/strong><\/code>. Here, <code><strong>g.V()<\/strong><\/code> is an <code><strong>Iterator<\/strong><\/code> of <code><strong>Vertex<\/strong><\/code>. <\/p>\n<p>If we had called <code><strong>g.E()<\/strong><\/code> instead, it would be an <code><strong>Iterator<\/strong><\/code> of <code><strong>Edge<\/strong><\/code>.<\/p>\n<p>To confirm that <code><strong>g.V()<\/strong><\/code> is an iterator of vertices, let&#8217;s just execute this code:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt; g.V().next().getClass()\r\n==&gt;class com.datastax.bdp.graph.impl.element.vertex.DsegCachedVertexImpl\r\n<\/pre>\n<p>Since it&#8217;s an iterator, we can call the method <code><strong>next()<\/strong><\/code> on it and then call <code><strong>getClass()<\/strong><\/code> on the result to get its type. 
As we can see, <code><strong>DsegCachedVertexImpl<\/strong><\/code> is just an implementation of the Gremlin <code><strong>Vertex<\/strong><\/code> interface. Of course, since it&#8217;s an iterator, we can also call <strong><code>hasNext()<\/code><\/strong>, along with the terminal methods that the Gremlin <code><strong>Traversal<\/strong><\/code> interface adds on top of <code><strong>Iterator<\/strong><\/code>, like <strong><code>tryNext()<\/code><\/strong>, <strong><code>next(n)<\/code><\/strong>, <strong><code>toList()<\/code><\/strong> and <strong><code>toSet()<\/code><\/strong>, and steps such as <strong><code>limit(n)<\/code><\/strong> and <strong><code>tail(n)<\/code><\/strong> to restrict the stream before consuming it.<\/p>\n<p>So far so good. Now let&#8217;s analyse a very simple traversal:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                            \r\n    hasLabel(&quot;person&quot;).             \r\n    has(&quot;name&quot;, &quot;Harrison Ford&quot;).   \r\n    next()\r\n==&gt;v[{~label=person, community_id=1425165696, member_id=527}]\r\n<\/pre>\n<p>Below is a table giving the conceptual stream-processing equivalent of each Gremlin step:<\/p>\n<table>\n<thead>\n<tr>\n<th>Gremlin step<\/th>\n<th>Stream equivalent<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>g.V()<\/td>\n<td>Iterator&lt;Vertex&gt; iterator = &#8230;<\/td>\n<\/tr>\n<tr>\n<td>.hasLabel(&#8220;person&#8221;)<\/td>\n<td>iterator.stream().filter(vertex -&gt; vertex instanceof Person) == Iterator&lt;Person&gt;<\/td>\n<\/tr>\n<tr>\n<td>.has(&#8220;name&#8221;, &#8220;Harrison Ford&#8221;)<\/td>\n<td>iterator.stream().filter(person -&gt; person.getName().equals(&#8220;Harrison Ford&#8221;)) == Iterator&lt;Person&gt;<\/td>\n<\/tr>\n<tr>\n<td>.next()<\/td>\n<td>iterator.stream().findFirst().get()<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The above traversal just fetches the whole vertex <strong>Harrison Ford<\/strong>. 
What if we only want to retrieve the name instead of the complete vertex?<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                            \r\n    hasLabel(&quot;person&quot;).             \r\n    has(&quot;name&quot;, &quot;Harrison Ford&quot;). \r\n    values(&quot;name&quot;).  \/\/ iterator.map(person -&gt; person.getName()) == Iterator&lt;String&gt;\r\n    next()\r\n==&gt;Harrison Ford\r\n<\/pre>\n<p>The step <code><strong>values(\"name\")<\/strong><\/code> is equivalent to a map operation on a Java stream: it extracts only the name property of the vertex.<\/p>\n<p>Now, if you wish to retrieve more properties from the vertex, you can either use <strong><code>values(\"property1\", \"property2\", ...,\"propertyN\")<\/code><\/strong> or use the step <code><strong>valueMap(\"property1\", \"property2\", ..., \"propertyN\")<\/strong><\/code>:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                            \r\n    hasLabel(&quot;person&quot;).             \r\n    has(&quot;name&quot;, &quot;Harrison Ford&quot;). \r\n    valueMap(&quot;name&quot;, &quot;personId&quot;). \/\/ iterator.map(person -&gt; ImmutableMap.of(&quot;name&quot;, person.getName(), &quot;personId&quot;, person.getPersonId()))\r\n    next()\r\n==&gt;name=[Harrison Ford]\r\n==&gt;personId=[p3211]\r\n<\/pre>\n<p>The step <code><strong>valueMap(\"name\", \"personId\")<\/strong><\/code> will transform the vertex object into a <strong>Map&lt;String, Object&gt; structure<\/strong>. <strong>Each key of the map is a property name, and each value is the list of values stored for that property<\/strong>. <\/p>\n<p>To prove it, just call <code><strong>getClass()<\/strong><\/code> again:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                            \r\n    hasLabel(&quot;person&quot;).             
\r\n    has(&quot;name&quot;, &quot;Harrison Ford&quot;). \r\n    valueMap(&quot;name&quot;, &quot;personId&quot;).\r\n    next().\r\n    getClass()\r\n==&gt;class java.util.HashMap\r\n<\/pre>\n<p>We get a <strong><code>java.util.HashMap<\/code><\/strong> as expected.<\/p>\n<h1 id=\"group_by\">III Grouping the stream<\/h1>\n<p>Now let&#8217;s do something more ambitious: let&#8217;s count the number of movies released per year in our dataset. For that, we use this traversal:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                    \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;).      \/\/ Iterator&lt;Movie&gt;   \r\n    groupCount().by(&quot;year&quot;) \/\/ Iterator&lt;Map&lt;Int == year of release, Long == count&gt;&gt;\r\n==&gt;{1921=1, 1925=1, 1926=1, 1927=2, 1931=1, 1933=1, 1934=1, 1935=2, 1936=1, 1937=2, 1938=2, 1939=4, 1940=6, 1941=3, 1942=3, 1943=1, 1944=2, 1945=2, 1946=3, 1948=8, 1949=4, 1950=8, 1951=6, 1952=8, 1953=8, 1954=9, 1955=5, 1956=5, 1957=7, 1958=4, 1959=8, 1960=7, 1961=8, 1962=6, 1963=3, 1964=8, 1965=4, 1966=6, 1967=6, 1968=8, 1969=6, 1970=9, 1971=4, 1972=6, 1973=5, 1974=4, 1975=7, 1976=4, 1977=5, 1978=2, 1979=8, 1980=8, 1981=3, 1982=10, 1983=4, 1984=13, 1985=8, 1986=7, 1987=6, 1988=12, 1989=12, 1990=13, 1991=10, 1992=8, 1993=16, 1994=21, 1995=27, 1996=16, 1997=28, 1998=29, 1999=26, 2000=24, 2001=30, 2002=23, 2003=36, 2004=45, 2005=35, 2006=40, 2007=32, 2008=39, 2009=28, 2010=20, 2011=19, 2012=19, 2013=12, 2014=4, 2015=1, 1903=1}\r\n<\/pre>\n<p>The interesting step here is <code><strong>groupCount().by(\"year\")<\/strong><\/code>. This step performs two actions in one: a <strong>grouping<\/strong> followed by a <strong>counting<\/strong>. The grouping uses the property <code><em>year<\/em><\/code> as the grouping key. 
The result of the grouping is <code><strong>Iterator&lt;Map&lt;Int == year of release, Collection&lt;Movie&gt;&gt;&gt;<\/strong><\/code>. The counting step will transform this iterator into <code><strong>Iterator&lt;Map&lt;Int == year of release, Long == number of movies in this year&gt;&gt;<\/strong><\/code>.<\/p>\n<blockquote><p>One interesting thing to notice here is that we have a nested collection structure. The outer collection is the <code><strong>Iterator<\/strong><\/code> and the inner collection is the <strong><code>Map&lt;Int,Long&gt;<\/code><\/strong>.\n<\/p><\/blockquote>\n<p>Because the whole grouping result is stored inside the map, the iterator contains only a single element, which is the map itself. We can prove it with:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                     \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;).       \/\/ Iterator&lt;Movie&gt;   \r\n    groupCount().by(&quot;year&quot;). \/\/ Iterator&lt;Map&lt;Int == year of release, Long == count&gt;&gt;\r\n    count()\r\n==&gt;1\r\n<\/pre>\n<p>Now, what if we want to access the content of the inner collection (the map here) inside the iterator? Of course we can always invoke <strong><code>next()<\/code><\/strong>, but Gremlin also offers an interesting alternative: <strong><code>unfold()<\/code><\/strong>. It is equivalent to the Java stream <strong><code>flatMap()<\/code><\/strong> method. Let&#8217;s see it in action:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                     \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;).       \/\/ Iterator&lt;Movie&gt;   \r\n    groupCount().by(&quot;year&quot;). \/\/ Iterator&lt;Map&lt;Int == year of release, Long == count&gt;&gt;\r\n    unfold().                
\/\/ iterator.stream().flatMap() == Iterator&lt;MapEntry&lt;Int,Long&gt;&gt;\r\n    next()\r\n==&gt;1921=1\r\n<\/pre>\n<h1>IV Advanced grouping in Gremlin<\/h1>\n<p>The grouping we have seen above (<strong><code>groupCount().by(\"year\")<\/code><\/strong>) is just a special case of a more general grouping form:  <strong><code>group().by(xxx).by(yyy)<\/code><\/strong><\/p>\n<p>The first <strong><code>by(xxx)<\/code><\/strong> defines the grouping key, i.e. the property\/value on which the grouping is done. It can be a property of the vertex\/edge itself or something more complicated like a <strong>complete traversal<\/strong>.<\/p>\n<p>The second <strong><code>by(yyy)<\/code><\/strong> can be:<\/p>\n<ol>\n<li>either a <strong>reducing step<\/strong> like <strong><code>count()<\/code><\/strong> to reduce the collection of matching vertices for each grouping key<\/li>\n<li>or a <strong>projection step<\/strong> to transform the matching collection of vertices into a collection of something else, which can be scalar values or vertices\/edges<\/li>\n<\/ol>\n<p>Let&#8217;s put all this into practice. First, our previous <strong><code>groupCount().by(\"year\")<\/code><\/strong> can be rewritten as:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().               \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;). \/\/ Iterator&lt;Movie&gt;   \r\n    group().\r\n      by(&quot;year&quot;).      
\/\/ iterator = Iterator&lt;Map&lt;Int == year of release, Collection&lt;Movie&gt;&gt;&gt;\r\n      by(count())      \/\/ iterator.stream().map(mapStructure -&gt; mapStructure.entrySet().stream().collect(Collectors.toMap(entry -&gt; entry.getKey(), entry -&gt; entry.getValue().size()))) \r\n==&gt;{1921=1, 1925=1, 1926=1, 1927=2, 1931=1, 1933=1, 1934=1, 1935=2, 1936=1, 1937=2, 1938=2, 1939=4, 1940=6, 1941=3, 1942=3, 1943=1, 1944=2, 1945=2, 1946=3, 1948=8, 1949=4, 1950=8, 1951=6, 1952=8, 1953=8, 1954=9, 1955=5, 1956=5, 1957=7, 1958=4, 1959=8, 1960=7, 1961=8, 1962=6, 1963=3, 1964=8, 1965=4, 1966=6, 1967=6, 1968=8, 1969=6, 1970=9, 1971=4, 1972=6, 1973=5, 1974=4, 1975=7, 1976=4, 1977=5, 1978=2, 1979=8, 1980=8, 1981=3, 1982=10, 1983=4, 1984=13, 1985=8, 1986=7, 1987=6, 1988=12, 1989=12, 1990=13, 1991=10, 1992=8, 1993=16, 1994=21, 1995=27, 1996=16, 1997=28, 1998=29, 1999=26, 2000=24, 2001=30, 2002=23, 2003=36, 2004=45, 2005=35, 2006=40, 2007=32, 2008=39, 2009=28, 2010=20, 2011=19, 2012=19, 2013=12, 2014=4, 2015=1, 1903=1}<\/pre>\n<p>As expected, the result is identical to what we got earlier.<\/p>\n<p>Now let&#8217;s say we want to perform a projection of the collection of movies instead of a reduction: for each year, we want a list of movie titles instead of a movie count. The previous traversal becomes:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().               \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;). \/\/ Iterator&lt;Movie&gt;   \r\n    group().\r\n      by(&quot;year&quot;).      \/\/ Iterator&lt;Map&lt;Int == year of release, Collection&lt;Movie&gt;&gt;&gt;\r\n      by(&quot;title&quot;).     \/\/ Iterator&lt;Map&lt;Int == year of release, Collection&lt;String == movie title&gt;&gt;&gt;\r\n      unfold().        
\/\/ Iterator&lt;MapEntry&lt;Int == year of release, Collection&lt;String == movie title&gt;&gt;&gt;\r\n      take(10)\r\n==&gt;1921=[The Kid]\r\n==&gt;1925=[The Gold Rush]\r\n==&gt;1926=[The General]\r\n==&gt;1927=[Metropolis, Sunrise]\r\n==&gt;1931=[City Lights]\r\n==&gt;1933=[Duck Soup]\r\n==&gt;1934=[It Happened One Night]\r\n==&gt;1935=[A Night at the Opera, Top Hat]\r\n==&gt;1936=[Modern Times]\r\n==&gt;1937=[Snow White and the Seven Dwarfs, Captains Courageous]\r\n<\/pre>\n<p>So far, so good. Now, instead of doing a simple projection on a Movie property, let&#8217;s use a traversal. We want to group the movies by year and, for each year, display the list of director names:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                                  \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;).                    \/\/ Iterator&lt;Movie&gt;   \r\n    group().\r\n      by(&quot;year&quot;).                         \/\/ Iterator&lt;Map&lt;Int == year of release, Collection&lt;Movie&gt;&gt;&gt;\r\n      by(out(&quot;director&quot;).values(&quot;name&quot;)). \/\/ Iterator&lt;Map&lt;Int == year of release, Collection&lt;String == director name&gt;&gt;&gt;\r\n      unfold().                           \/\/ Iterator&lt;MapEntry&lt;Int == year of release, Collection&lt;String == director name&gt;&gt;&gt;\r\n      take(10)\r\n==&gt;1921=Charles Chaplin\r\n==&gt;1925=Charles Chaplin\r\n==&gt;1926=Buster Keaton\r\n==&gt;1927=F.W. Murnau\r\n==&gt;1931=Charles Chaplin\r\n==&gt;1933=Leo McCarey\r\n==&gt;1934=Frank Capra\r\n==&gt;1935=Mark Sandrich\r\n==&gt;1936=Charles Chaplin\r\n==&gt;1937=Victor Fleming\r\n<\/pre>\n<p>Strangely enough, we only get one director name for each year, which is not correct. 
It looks like a bug in Gremlin but it isn&#8217;t.<\/p>\n<blockquote><p>When we traverse the &#8220;director&#8221; edge from a &#8220;movie&#8221; vertex, the relationship is 1-to-N because a movie can have more than one director, so we would end up with Collection&lt;Collection&lt;String == director name&gt;&gt; and thus a <strong>combinatorial explosion<\/strong>. By design, Gremlin only calls <strong><code>next()<\/code><\/strong> on the inner traversal, maybe to avoid such an explosion (this is my assumption, to be confirmed).\n<\/p><\/blockquote>\n<p>To force Gremlin to be exhaustive on the inner traversal <strong><code>out(\"director\").values(\"name\")<\/code><\/strong>, we can use the <strong><code>fold()<\/code><\/strong> step:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                                         \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;).                           \/\/ Iterator&lt;Movie&gt;   \r\n    group().\r\n      by(&quot;year&quot;).                                \/\/ Iterator&lt;Map&lt;Int == year of release, Collection&lt;Movie&gt;&gt;&gt;\r\n      by(out(&quot;director&quot;).values(&quot;name&quot;).fold()). \/\/ Iterator&lt;Map&lt;Int == year of release, Collection&lt;String == director name&gt;&gt;&gt;\r\n      unfold().                                  \/\/ Iterator&lt;MapEntry&lt;Int == year of release, Collection&lt;String == director name&gt;&gt;&gt;\r\n      take(10)\r\n==&gt;1921=[Charles Chaplin]\r\n==&gt;1925=[Charles Chaplin]\r\n==&gt;1926=[Buster Keaton, Clyde Bruckman]\r\n==&gt;1927=[Fritz Lang, F.W. 
Murnau]\r\n==&gt;1931=[Charles Chaplin]\r\n==&gt;1933=[Leo McCarey]\r\n==&gt;1934=[Frank Capra]\r\n==&gt;1935=[Sam Wood, Mark Sandrich]\r\n==&gt;1936=[Charles Chaplin]\r\n==&gt;1937=[David Hand, Victor Fleming]\r\n<\/pre>\n<p>Here we are!<\/p>\n<p>Finally, we can also use a <strong>complete traversal<\/strong> as the grouping key instead of a simple vertex property. Let&#8217;s say we want to group movies by director name and display the movie titles for each director:<\/p>\n<pre class=\"brush: java; title: ; wrap-lines: false; notranslate\" title=\"\">\r\ngremlin&gt;g.\r\n    V().                                  \/\/ Iterator&lt;Vertex&gt;    \r\n    hasLabel(&quot;movie&quot;).                    \/\/ Iterator&lt;Movie&gt;   \r\n    group().\r\n      by(out(&quot;director&quot;).values(&quot;name&quot;)). \/\/ Iterator&lt;Map&lt;String == director name, Collection&lt;Movie&gt;&gt;&gt;\r\n      by(&quot;title&quot;).                        \/\/ Iterator&lt;Map&lt;String == director name, Collection&lt;String == movie title&gt;&gt;&gt;\r\n      unfold().                           
\/\/ Iterator&lt;MapEntry&lt;String == director name, Collection&lt;String == movie title&gt;&gt;&gt;\r\n      take(10)\r\n==&gt;Kwak Jae-young=[My Sassy Girl]\r\n==&gt;Peter Jackson=[The Lord of the Rings: The Two Towers, The Lord of the Rings: The Fellowship of the Ring, The Hobbit: The Desolation of Smaug, The Lord of the Rings: The Return of the King, King Kong, The Hobbit: An Unexpected Journey]\r\n==&gt;Alejandro Agresti=[The Lake House]\r\n==&gt;George Pan Cosmatos=[Tombstone]\r\n==&gt;Pitof=[Catwoman]\r\n==&gt;Raman Hui=[Shrek the Third]\r\n==&gt;Michael Bay=[Transformers, Armageddon, The Island, Pearl Harbor, Bad Boys, The Rock]\r\n==&gt;Santiago Segura=[Torrente, el brazo tonto de la ley, Torrente 2: Mission in Marbella, Torrente 3]\r\n==&gt;Robert Stevenson=[Mary Poppins]\r\n==&gt;Clare Kilner=[The Wedding Date]\r\n<\/pre>\n<p>Again, putting <strong><code>out(\"director\").values(\"name\")<\/code><\/strong> inside the first <strong><code>by()<\/code><\/strong> instead of &#8220;year&#8221; does the trick.<\/p>\n<p>And that&#8217;s all folks! <strong>Do not miss the other Gremlin recipes in this series<\/strong>.<\/p>\n<p>If you have any questions about <strong>Gremlin<\/strong>, find me on <strong><a href=\"http:\/\/datastaxacademy.slack.com\" target=\"_blank\">datastaxacademy.slack.com<\/a><\/strong>, channel <strong>dse-graph<\/strong>. My id is <em>@doanduyhai<\/em>   <\/p>\n","protected":false},"excerpt":{"rendered":"<p>This blog post belongs to a series called Gremlin Recipes. 
The purpose is to explain the internals of Gremlin and give people a deeper insight into the query language so they can master it.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[58,10],"tags":[],"_links":{"self":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/13224"}],"collection":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13224"}],"version-history":[{"count":38,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/13224\/revisions"}],"predecessor-version":[{"id":13347,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/13224\/revisions\/13347"}],"wp:attachment":[{"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13224"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13224"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.doanduyhai.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13224"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}