Archive for February, 2010

rto: sfYaml complies with 43% of the YAML 1.2 spec

Monday, February 8th, 2010

For a recent project I needed to write a small domain-specific language. I decided to base it on YAML, as the DSL was also array based.

While reading the YAML 1.2 spec, I was surprised to find that some things don’t work in the current sfYaml implementation. The most important one to me is that the short syntax for lists of mappings is not supported (yet); see example 2.12:

- item    : Super Hoop
  quantity: 1
- item    : Basketball
  quantity: 4
- item    : Big Shoes
  quantity: 1

Below is a table summarizing my findings. Currently only 12 out of 28 examples are parsed by sfYaml. Some errors are easy to fix (the ‘---’ document separator, for instance). Hopefully together we can make sfYaml 100% compliant with spec 1.2!

NB. Tests were run against a symfony 1.3 trunk library; link to the script used
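
As a rough illustration of how such a tally can be produced, here is a minimal sketch in PHP. It is not the script used for the table in this post: `tallyCompliance()` and the stub parser are hypothetical names introduced for this example only. With symfony 1.x one would presumably pass a closure wrapping `sfYaml::load()` instead of the stub.

```php
<?php
// Hypothetical harness, NOT the script used for the results in this post.
// $parse is any callable that returns parsed data or throws on failure.
function tallyCompliance(array $examples, callable $parse): array
{
    $passed = [];
    $failed = [];
    foreach ($examples as $name => $yaml) {
        try {
            $parse($yaml);
            $passed[] = $name;
        } catch (Exception $e) {
            // Keep the parser's message so it can be shown next to the example
            $failed[$name] = $e->getMessage();
        }
    }
    $total   = count($examples);
    $percent = $total > 0 ? (int) round(100 * count($passed) / $total) : 0;

    return ['passed' => $passed, 'failed' => $failed, 'percent' => $percent];
}

// Stand-in parser for illustration only: it rejects any input that
// contains a line starting with the '---' document marker.
$stubParser = function (string $yaml) {
    if (preg_match('/^---/m', $yaml)) {
        throw new InvalidArgumentException('Unable to parse document marker');
    }
    return $yaml;
};

$result = tallyCompliance([
    '2.1' => "- Mark McGwire\n- Sammy Sosa\n- Ken Griffey\n",
    '2.7' => "---\n- Mark McGwire\n---\n- Chicago Cubs\n",
], $stubParser);

echo $result['percent']."% parsed\n"; // 50% parsed
echo round(100 * 12 / 28)."\n";       // 43
```

The last line shows where the figure in the title comes from: 12 parsed examples out of 28, rounded to a whole percentage.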

2.1. Sequence of Scalars
- Mark McGwire
- Sammy Sosa
- Ken Griffey
sfYaml output for this spec 1.2 example:
Array
(
    [0] => Mark McGwire
    [1] => Sammy Sosa
    [2] => Ken Griffey
)
2.2. Mapping Scalars to Scalars
hr:  65    # Home runs
avg: 0.278 # Batting average
rbi: 147   # Runs Batted In
sfYaml output for this spec 1.2 example:
Array
(
    [hr] => 65
    [avg] => 0.278
    [rbi] => 147
)
2.3. Mapping Scalars to Sequences
american:
  - Boston Red Sox
  - Detroit Tigers
  - New York Yankees
national:
  - New York Mets
  - Chicago Cubs
  - Atlanta Braves
sfYaml output for this spec 1.2 example:
Array
(
    [american] => Array
        (
            [0] => Boston Red Sox
            [1] => Detroit Tigers
            [2] => New York Yankees
        )

    [national] => Array
        (
            [0] => New York Mets
            [1] => Chicago Cubs
            [2] => Atlanta Braves
        )

)
2.4. Sequence of Mappings
-
  name: Mark McGwire
  hr:   65
  avg:  0.278
-
  name: Sammy Sosa
  hr:   63
  avg:  0.288
sfYaml output for this spec 1.2 example:
Array
(
    [0] => Array
        (
            [name] => Mark McGwire
            [hr] => 65
            [avg] => 0.278
        )

    [1] => Array
        (
            [name] => Sammy Sosa
            [hr] => 63
            [avg] => 0.288
        )

)
2.5. Sequence of Sequences
- [name        , hr, avg  ]
- [Mark McGwire, 65, 0.278]
- [Sammy Sosa  , 63, 0.288]
sfYaml output for this spec 1.2 example:
Array
(
    [0] => Array
        (
            [0] => name
            [1] => hr
            [2] => avg
        )

    [1] => Array
        (
            [0] => Mark McGwire
            [1] => 65
            [2] => 0.278
        )

    [2] => Array
        (
            [0] => Sammy Sosa
            [1] => 63
            [2] => 0.288
        )

)
2.6. Mapping of Mappings
Mark McGwire: {hr: 65, avg: 0.278}
Sammy Sosa: {
    hr: 63,
    avg: 0.288
  }
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Malformed inline YAML string {
2.7. Two Documents in a Stream
# Ranking of 1998 home runs
---
- Mark McGwire
- Sammy Sosa
- Ken Griffey

# Team ranking
---
- Chicago Cubs
- St Louis Cardinals
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 2 (---).
2.8. Play by Play Feed
---
time: 20:03:20
player: Sammy Sosa
action: strike (miss)
...
---
time: 20:03:47
player: Sammy Sosa
action: grand slam
...
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 4 (...).
2.9. Single Document with Two Comments
---
hr: # 1998 hr ranking
  - Mark McGwire
  - Sammy Sosa
rbi:
  # 1998 rbi ranking
  - Sammy Sosa
  - Ken Griffey
sfYaml output for this spec 1.2 example:
Array
(
    [hr] => Array
        (
            [0] => Mark McGwire
            [1] => Sammy Sosa
        )

    [rbi] => Array
        (
            [0] => Sammy Sosa
            [1] => Ken Griffey
        )

)
2.10. Node for ‘Sammy Sosa’ appears twice in this document
---
hr:
  - Mark McGwire
  # Following node labeled SS
  - &SS Sammy Sosa
rbi:
  - *SS # Subsequent occurrence
  - Ken Griffey
sfYaml output for this spec 1.2 example:
Array
(
    [hr] => Array
        (
            [0] => Mark McGwire
            [1] => Sammy Sosa
        )

    [rbi] => Array
        (
            [0] => Sammy Sosa
            [1] => Ken Griffey
        )

)
2.11. Mapping between Sequences
? - Detroit Tigers
  - Chicago cubs
:
  - 2001-07-23

? [ New York Yankees,
    Atlanta Braves ]
: [ 2001-07-02, 2001-08-12,
    2001-08-14 ]
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 1 (? - Detroit Tigers).
2.12. Compact Nested Mapping
---
# Products purchased
- item    : Super Hoop
  quantity: 1
- item    : Basketball
  quantity: 4
- item    : Big Shoes
  quantity: 1
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 3 ( quantity: 1).
2.13. In literals, newlines are preserved
# ASCII Art
--- |
  \//||\/||
  // ||  ||__
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 2 (--- |).
2.14. In the folded scalars, newlines become spaces
--- >
  Mark McGwire's
  year was crippled
  by a knee injury.
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 1 ( Mark McGwire's).
2.15. Folded newlines are preserved for ‘more indented’ and blank lines
>
 Sammy Sosa completed another
 fine season with great stats.

   63 Home Runs
   0.288 Batting Average

 What a year!
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 1 (>).
2.16. Indentation determines scope
name: Mark McGwire
accomplishment: >
  Mark set a major league
  home run record in 1998.
stats: |
  65 Home Runs
  0.278 Batting Average
sfYaml output for this spec 1.2 example:
Array
(
    [name] => Mark McGwire
    [accomplishment] => Mark set a major league home run record in 1998.

    [stats] => 65 Home Runs
0.278 Batting Average

)
2.17. Quoted Scalars
unicode: "Sosa did fine.\u263A"
control: "\b1998   1999    2000
"
hex esc: "
 is 
"

single: '"Howdy!" he cried.'
quoted: ' # Not a ''comment''.'
tie-fighter: '|\-*-/|'
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Malformed inline YAML string ("\b1998 1999 2000).
2.18. Multi-line Flow Scalars
plain:
  This unquoted scalar
  spans many lines.

quoted: "So does this
  quoted scalar.
"
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 2 (This unquoted scalar).
2.19. Integers
canonical: 12345
decimal: +12345
octal: 0o14
hexadecimal: 0xC
sfYaml output for this spec 1.2 example:
Array
(
    [canonical] => 12345
    [decimal] => 12345
    [octal] => 0o14
    [hexadecimal] => 12
)
2.20. Floating Point
canonical: 1.23015e+3
exponential: 12.3015e+02
fixed: 1230.15
negative infinity: -.inf
not a number: .NaN
sfYaml output for this spec 1.2 example:
Array
(
    [canonical] => 1230.15
    [exponential] => 1230.15
    [fixed] => 1230.15
    [negative infinity] => -INF
    [not a number] => INF
)
2.21. Miscellaneous
null:
booleans: [ true, false ]
string: '012345'
sfYaml output for this spec 1.2 example:
Array
(
    [] => 
    [booleans] => Array
        (
            [0] => 1
            [1] => 
        )

    [string] => 012345
)
2.22. Timestamps
canonical: 2001-12-15T02:59:43.1Z
iso8601: 2001-12-14t21:59:43.10-05:00
spaced: 2001-12-14 21:59:43.10 -5
date: 2002-12-14
sfYaml output for this spec 1.2 example:
Array
(
    [canonical] => 1008385183
    [iso8601] => 1008385183
    [spaced] => 1008385183
    [date] => 1039824000
)
2.23. Various Explicit Tags
---
not-date: !!str 2002-04-28

picture: !!binary |
 R0lGODlhDAAMAIQAAP//9/X
 17unp5WZmZgAAAOfn515eXv
 Pz7Y6OjuDg4J+fn5OTk6enp
 56enmleECcgggoBADs=

application specific tag: !something |
 The semantics of the tag
 above may be different for
 different documents.
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 4 ( R0lGODlhDAAMAIQAAP//9/X).
2.24. Global Tags
%TAG ! tag:clarkevans.com,2002:
--- !shape
  # Use the ! handle for presenting
  # tag:clarkevans.com,2002:circle
- !circle
  center: &ORIGIN {x: 73, y: 129}
  radius: 7
- !line
  start: *ORIGIN
  finish: { x: 89, y: 102 }
- !label
  start: *ORIGIN
  color: 0xFFEEBB
  text: Pretty vector drawing.
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 2 (--- !shape).
2.25. Unordered Sets
# Sets are represented as a
# Mapping where each key is
# associated with a null value
--- !!set
? Mark McGwire
? Sammy Sosa
? Ken Griff
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 4 (--- !!set).
2.26. Ordered Mappings
# Ordered maps are represented as
# A sequence of mappings, with
# each mapping having one key
--- !!omap
- Mark McGwire: 65
- Sammy Sosa: 63
- Ken Griffy: 58
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 4 (--- !!omap).
2.27. Invoice
--- !<tag:clarkevans.com,2002:invoice>
invoice: 34843
date   : 2001-01-23
bill-to: &id001
  given  : Chris
  family : Dumars
  address:
    lines: |
      458 Walkman Dr.
      Suite #292
    city    : Royal Oak
    state   : MI
    postal  : 48046
ship-to: *id001
product:
  - sku         : BL394D
    quantity    : 4
    description : Basketball
    price       : 450.00
  - sku         : BL4438H
    quantity    : 1
    description : Super Hoop
    price       : 2392.00
tax  : 251.42
total: 4443.52
comments:
  Late afternoon is best.
  Backup contact is Nancy
  Billsmer @ 338-4338.
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 16 ( quantity : 4).
2.28. Log File
---
Time: 2001-11-23 15:01:42 -5
User: ed
Warning:
  This is an error message
  for the log file
---
Time: 2001-11-23 15:02:31 -5
User: ed
Warning:
  A slightly different error
  message.
---
Date: 2001-11-23 15:03:17 -5
User: ed
Fatal:
  Unknown variable "bar"
Stack:
  - file: TopClass.py
    line: 23
    code: |
      x = MoreObject("345\n")
  - file: MoreClass.py
    line: 58
    code: |-
      foo = bar
sfYaml output for this spec 1.2 example:
Error, unable to load YML: Unable to parse string: Unable to parse line 4 (This is an error message).

rto: Access Git with PHP, go GLIP!

Monday, February 1st, 2010

GLIP, a PHP library started by Patrik Fimml, allows you to interact with Git repositories on your system. Although the library is not complete, it is already quite useful. Plus it’s the only library (so far) for accessing Git from PHP, so let’s give it a go!

This post uses the code from my fork on GitHub. I have changed the API of the original library to be more user friendly; the internals are almost the same. There’s quite a lot of activity on different forks of GLIP, with everybody focusing on different aspects. I hope they will all merge into a single, complete PHP library with the functionality of Ruby’s Grit.

Before diving into code, it’s important to understand (a little) how Git works internally. Scott Chacon, one of the developers of Git, has a very good presentation on Git internals; see this movie! It is almost an hour long and the first twenty or so minutes are the best, but the whole presentation is worth watching. Other good documentation includes the GitCasts and the recently published Pro Git book.
From here on I assume you understand the basics of how a commit works, what blob and tree objects are, and how Git manages to always store incrementally.
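
As a small refresher on what a blob is: Git names each blob by the SHA-1 of a short header followed by the file content. The sketch below reproduces that in plain PHP. `gitBlobSha()` is a hypothetical helper written for this illustration; it is not part of GLIP.

```php
<?php
// Hypothetical helper (not a GLIP function): compute the object id Git
// assigns to a blob, i.e. SHA-1 over "blob <size in bytes>\0<content>".
function gitBlobSha(string $content): string
{
    $header = 'blob '.strlen($content)."\0";

    return sha1($header.$content);
}

echo gitBlobSha('')."\n";              // e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
echo gitBlobSha("hello world\n")."\n"; // 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

The same content always yields the same sha, which is why identical files are stored only once.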

Check out the code from GitHub

As said earlier, this post uses my fork of GLIP. Clone the GLIP code into a fresh directory as follows:

> git clone git://github.com/redotheoffice/glip.git
Initialized empty Git repository in /cygdrive/c/weett/glip/glip/.git/
remote: Counting objects: 288, done.
remote: Compressing objects: 100% (165/165), done.
remote: Total 288 (delta 153), reused 197 (delta 104)
Resolving deltas: 100% (153/153), done.

Reading with GLIP

GLIP works on bare repositories, which means it only looks up files inside the .git directory. The following code demonstrates how to read from a repository.

<?php
 
/* A short guide on how to use GLIP with PHP to manipulate Git Repositories */
 
// first include the library
require_once(dirname(__FILE__).'/glip/lib/glip.php');
 
// Create a link to the .git repository of GLIP, which you downloaded from
// git clone git://github.com/redotheoffice/glip.git
$repo = new Git(dirname(__FILE__).'/glip/.git');
 
 
// When using array access on a Git object, it will return a branch object, named by the key you provide
// Currently GLIP is unable to produce a list of available branches, see TODO list.
$branch = $repo['master'];  // returns GitBranch
 
 
// =========================================================================================================================
// GitCommit
// =========================================================================================================================
 
// The easiest way to work with a branch is to work with its commit tip, which is the latest commit in the branch
$commit = $branch->getTip(); // returns GitCommit
 
// Each commit object has several properties
echo "GitCommit->tree points to: ".get_class($commit->tree)." ".$commit->tree->getSha()->h(6)."\n";
foreach ($commit->parents as $index => $parent)
{
  echo "GitCommit->parents[$index] points to: ".get_class($parent)." ".$parent->getSha()->h(6)."\n";
}
echo "GitCommit->author:         ".$commit->author->serialize()."\n";
echo "GitCommit->committer:      ".$commit->committer->serialize()."\n";
echo "GitCommit->summary:        '".substr($commit->summary,0,40)."'\n"; // summary is the first line of the message attached to the commit
echo "GitCommit->detail:         '".substr($commit->detail,0,40)."'\n"; // detail holds all other lines of the message
// GitCommit->tree points to: GitTree fcad45
// GitCommit->parents[0] points to: GitCommit 6907e2
// GitCommit->author:         Sjoerd de Jong <sjoerd@weett.nl> 1255502607 +0700
// GitCommit->committer:      Sjoerd de Jong <sjoerd@weett.nl> 1255502607 +0700
// GitCommit->summary:        'fixed getHistory'
// GitCommit->detail:         ''
 
// You can iterate over the commit to see all files inside
$tree = null;
foreach ($commit as $name => $data)
{
  echo "$name => is a ".get_class($data)." ".$data->getSha()->h(6)."\n";
 
  // this part of the code is just to get a GitTree object, to show the next feature.
  if ($data instanceof GitTree)
  {
    $tree = $data;
  }
}
// .gitignore => is a GitBlob 0bd313
// Doxyfile => is a GitBlob 6d0081
// HACKING => is a GitBlob ee36a4
// LICENSE => is a GitBlob d51190
// README => is a GitBlob 9b2bb6
// doc => is a GitTree d683aa
// lib => is a GitTree 2cf391
// test => is a GitTree a31e96
 
 
// =========================================================================================================================
// Sha objects
// =========================================================================================================================
 
// All objects inheriting from GitObject have a getSha() method
// Which exposes a small API for getting different sha values
echo "|GitObject|::getSha()->hex() = ".$commit->getSha()->hex()."\n";
echo "|GitObject|::getSha()->h()   = ".$commit->getSha()->h().  "\n";
echo "|GitObject|::getSha()->h(6)  = ".$commit->getSha()->h(6). "\n"; // truncated hex encoding
echo "|GitObject|::getSha()->b64() = ".$commit->getSha()->b64()."\n"; // base64 encoding
echo "|GitObject|::getSha()->bin() = ".$commit->getSha()->bin()."\n"; // binary, should not be echoed as it is binary
echo "|GitObject|::getSha()->b()   = ".$commit->getSha()->b().  "\n"; // binary, should not be echoed as it is binary
echo "|GitObject|::getSha()        = ".$commit->getSha().       "\n"; // binary, should not be echoed as it is binary
// |GitObject|::getSha()->hex() = 371107565e55fce7f2babd9acde491f6e14f2908
// |GitObject|::getSha()->h()   = 371107565e55fce7f2babd9acde491f6e14f2908
// |GitObject|::getSha()->h(6)  = 371107
// |GitObject|::getSha()->b64() = NxEHVl5V/Ofyur2azeSR9uFPKQg=
// |GitObject|::getSha()->bin() = 7◄V^Uⁿτ≥║╜Ü═Σæ÷ßO)
// |GitObject|::getSha()->b()   = 7◄V^Uⁿτ≥║╜Ü═Σæ÷ßO)
// |GitObject|::getSha()        = 7◄V^Uⁿτ≥║╜Ü═Σæ÷ßO)
 
 
// =========================================================================================================================
// GitTree
// =========================================================================================================================
 
// You can iterate over GitTree objects, to see their contents
$blob = null;
if ($tree instanceof GitTree)
{
  echo "Tree ".$tree->getSha()->h(6)." holds ".count($tree)." items.\n";
 
  foreach ($tree as $name => $data)
  {
    echo "inside tree: $name => is a ".get_class($data)."\n";
 
    // this part of the code is just to get a GitBlob object, to show the next feature.
    if ($data instanceof GitBlob)
    {
      $blob = $data;
    }
  }  
  // Tree a31e96 holds 3 items.
  // inside tree: bootstrap => is a GitTree
  // inside tree: readme.txt => is a GitBlob
  // inside tree: unit => is a GitTree
 
 
// =========================================================================================================================
// GitBlob
// =========================================================================================================================
 
  // You can access all properties of a GitBlob
  if ($blob instanceof GitBlob)
  {
    echo "Blob holds data: '".substr($blob->data,0,40)."...'\n";
  }
  // Blob holds data: 'These tests are all written for the LIME...'
}
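
The different sha representations printed above are all encodings of the same 20 bytes. A minimal sketch in plain PHP, without GLIP, using the commit sha from the output above; it only shows how the hex, truncated hex, and base64 forms relate, and makes no claim about how GLIP implements them internally:

```php
<?php
// The commit sha from the example output, as 40 hex characters
$hex = '371107565e55fce7f2babd9acde491f6e14f2908';

// Hex to raw binary: 40 hex characters become 20 bytes
$bin = pack('H*', $hex);

echo strlen($bin)."\n";        // 20
echo substr($hex, 0, 6)."\n";  // 371107
echo base64_encode($bin)."\n"; // NxEHVl5V/Ofyur2azeSR9uFPKQg=
echo bin2hex($bin)."\n";       // 371107565e55fce7f2babd9acde491f6e14f2908
```

The raw binary form is what Git stores inside tree objects, which is why it prints as garbage on a terminal.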

Writing with GLIP

Writing takes more explanation. I will write another entry in the coming days on writing to a repository with GLIP.

Thanks for reading!