mirror of
https://github.com/gopl-zh/gopl-zh.github.com.git
synced 2026-01-16 04:07:13 +08:00
rebuild
@@ -8,7 +8,7 @@
<title>Example: Decoding S-Expressions | Go语言圣经</title>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
<meta name="description" content="">
<meta name="generator" content="GitBook 2.5.2">
<meta name="generator" content="GitBook 2.6.6">


<meta name="HandheldFriendly" content="true"/>
@@ -48,7 +48,13 @@
<body>


<div class="book" data-level="12.6" data-chapter-title="示例: 解碼S表達式" data-filepath="ch12/ch12-06.md" data-basepath=".." data-revision="Thu Dec 31 2015 16:18:40 GMT+0800 (中国标准时间)">
<div class="book"
data-level="12.6"
data-chapter-title="示例: 解碼S表達式"
data-filepath="ch12/ch12-06.md"
data-basepath=".."
data-revision="Sat Jan 02 2016 16:00:23 GMT+0800 (中国标准时间)"
data-innerlanguage="">


<div class="book-summary">
@@ -2024,7 +2030,140 @@
<section class="normal" id="section-">

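<h2 id="126-示例-解碼s表達式">12.6. Example: Decoding S-Expressions</h2>
<p>TODO</p>
<p>For each Marshal function provided by the packages under encoding/... in the standard library, there is a corresponding Unmarshal function that does the decoding. For example, as we saw in §4.5, to decode a byte slice containing JSON-encoded data into our Movie type (§12.3), we can write:</p>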
<pre><code class="lang-Go">data := []<span class="hljs-typename">byte</span>{<span class="hljs-comment">/* ... */</span>}
<span class="hljs-keyword">var</span> movie Movie
err := json.Unmarshal(data, &movie)
</code></pre>
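<p>The Unmarshal function uses reflection to modify the fields of the existing movie variable, creating new maps, structs, and slices as required by the Movie type and the content of the incoming data.</p>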
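<p>As a rough illustration of the call above, here is a minimal, self-contained sketch. The <code>Movie</code> type shown is a cut-down, hypothetical stand-in for the one in §12.3, and <code>decodeMovie</code> is a helper name introduced only for this example.</p>

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Movie is a simplified, hypothetical stand-in for the Movie type of §12.3.
type Movie struct {
	Title  string
	Year   int
	Actors []string
}

// decodeMovie decodes JSON-encoded data into a fresh Movie value.
// json.Unmarshal allocates the Actors slice as the input requires.
func decodeMovie(data []byte) (Movie, error) {
	var movie Movie
	err := json.Unmarshal(data, &movie)
	return movie, err
}

func main() {
	data := []byte(`{"Title": "Casablanca", "Year": 1942, "Actors": ["Humphrey Bogart", "Ingrid Bergman"]}`)
	movie, err := decodeMovie(data)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", movie)
}
```

<p>Note that the caller passes a pointer (<code>&movie</code>), since Unmarshal can only update a variable it can reach through an address.</p>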
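<p>Now let's implement a simple Unmarshal for S-expressions, analogous to the standard json.Unmarshal function above and the inverse of our earlier sexpr.Marshal function. We must caution that a robust and general implementation requires substantially more code than this example, so we have taken shortcuts to keep the demonstration compact. We support only a limited subset of S-expressions and handle errors crudely. The code is meant to illustrate reflection, not to build a practical S-expression decoder.</p>
<p>The lexer uses the text/scanner package from the standard library to break an input stream of bytes into a sequence of tokens such as comments, identifiers, string literals, and numeric literals. The scanner's Scan method advances the scanner and returns the kind of the next token, which has type rune. Most tokens, such as "(", consist of a single rune, but text/scanner represents the kinds of multi-character tokens such as identifiers and strings using small negative rune values. After a call to Scan returns one of these kinds, the TokenText method returns the text of the token.</p>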
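<p>To make the token kinds concrete, the following sketch scans a small S-expression with text/scanner and records each token's kind and text. The <code>token</code> type and <code>tokenize</code> function are names invented for this illustration and are not part of the sexpr package.</p>

```go
package main

import (
	"fmt"
	"strings"
	"text/scanner"
)

// token pairs the kind returned by Scan with the text from TokenText.
// (Hypothetical helper type, for illustration only.)
type token struct {
	Kind rune
	Text string
}

// tokenize scans src and returns every token up to end of input.
func tokenize(src string) []token {
	var s scanner.Scanner
	s.Init(strings.NewReader(src))
	s.Mode = scanner.GoTokens // the same mode the decoder in this section uses
	var toks []token
	for kind := s.Scan(); kind != scanner.EOF; kind = s.Scan() {
		toks = append(toks, token{kind, s.TokenText()})
	}
	return toks
}

func main() {
	for _, t := range tokenize(`(Title "Brazil" 1985)`) {
		// TokenString gives a printable name for the kind, e.g. Ident or String.
		fmt.Printf("%-8s %s\n", scanner.TokenString(t.Kind), t.Text)
	}
}
```

<p>Single-character tokens such as the parentheses come back as the rune itself, while the identifier, string, and integer come back as negative kinds, which is why the lexer can store the current token in a single rune field.</p>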
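<p>Since a typical parser may need to inspect the current token several times, but the Scan method always advances the scanner, we wrap the scanner in a helper type called lexer that keeps track of the token most recently returned by Scan.</p>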
<pre><code class="lang-Go">gopl.io/ch12/sexpr

<span class="hljs-keyword">type</span> lexer <span class="hljs-keyword">struct</span> {
    scan  scanner.Scanner
    token <span class="hljs-typename">rune</span> <span class="hljs-comment">// the current token</span>
}

<span class="hljs-keyword">func</span> (lex *lexer) next()        { lex.token = lex.scan.Scan() }
<span class="hljs-keyword">func</span> (lex *lexer) text() <span class="hljs-typename">string</span> { <span class="hljs-keyword">return</span> lex.scan.TokenText() }

<span class="hljs-keyword">func</span> (lex *lexer) consume(want <span class="hljs-typename">rune</span>) {
    <span class="hljs-keyword">if</span> lex.token != want { <span class="hljs-comment">// <span class="hljs-doctag">NOTE:</span> Not an example of good error handling.</span>
        <span class="hljs-built_in">panic</span>(fmt.Sprintf(<span class="hljs-string">"got %q, want %q"</span>, lex.text(), want))
    }
    lex.next()
}
</code></pre>
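<p>Now let's turn to the parser. It consists of two principal functions. The first, read, reads the S-expression that begins with the current token and updates the variable referred to by the addressable reflect.Value v.</p>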
<pre><code class="lang-Go"><span class="hljs-keyword">func</span> read(lex *lexer, v reflect.Value) {
    <span class="hljs-keyword">switch</span> lex.token {
    <span class="hljs-keyword">case</span> scanner.Ident:
        <span class="hljs-comment">// The only valid identifiers are</span>
        <span class="hljs-comment">// "nil" and struct field names.</span>
        <span class="hljs-keyword">if</span> lex.text() == <span class="hljs-string">"nil"</span> {
            v.Set(reflect.Zero(v.Type()))
            lex.next()
            <span class="hljs-keyword">return</span>
        }
    <span class="hljs-keyword">case</span> scanner.String:
        s, _ := strconv.Unquote(lex.text()) <span class="hljs-comment">// <span class="hljs-doctag">NOTE:</span> ignoring errors</span>
        v.SetString(s)
        lex.next()
        <span class="hljs-keyword">return</span>
    <span class="hljs-keyword">case</span> scanner.Int:
        i, _ := strconv.Atoi(lex.text()) <span class="hljs-comment">// <span class="hljs-doctag">NOTE:</span> ignoring errors</span>
        v.SetInt(<span class="hljs-typename">int64</span>(i))
        lex.next()
        <span class="hljs-keyword">return</span>
    <span class="hljs-keyword">case</span> <span class="hljs-string">'('</span>:
        lex.next()
        readList(lex, v)
        lex.next() <span class="hljs-comment">// consume ')'</span>
        <span class="hljs-keyword">return</span>
    }
    <span class="hljs-built_in">panic</span>(fmt.Sprintf(<span class="hljs-string">"unexpected token %q"</span>, lex.text()))
}
</code></pre>
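<p>Our S-expressions use identifiers for two distinct purposes: struct field names and the nil value of a pointer. The read function handles only the latter. When it encounters the scanner.Ident "nil", it uses reflect.Zero to set the variable v to the zero value of its type. Any other identifier is treated as an error. The readList function, described next, handles identifiers used as struct field names.</p>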
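<p>The "nil" case can be tried in isolation. The sketch below applies the same <code>v.Set(reflect.Zero(v.Type()))</code> statement to ordinary variables; <code>setZero</code> is a hypothetical helper introduced only for this example.</p>

```go
package main

import (
	"fmt"
	"reflect"
)

// setZero resets the variable that ptr points to, to the zero value of
// its type. For a pointer variable the zero value is nil.
func setZero(ptr interface{}) {
	v := reflect.ValueOf(ptr).Elem() // the addressable variable itself
	v.Set(reflect.Zero(v.Type()))
}

func main() {
	n := 42
	p := &n
	setZero(&p) // p has type *int, so it becomes nil
	fmt.Println(p == nil)

	s := "hello"
	setZero(&s) // s becomes the empty string
	fmt.Printf("%q\n", s)
}
```

<p>Because reflect.Zero works for any type, the decoder does not need a separate branch per pointer type.</p>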
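<p>A "(" token indicates the start of a list. The second function, readList, decodes a list into a variable of composite type (a map, struct, slice, or array), depending on the type of the variable being populated. In each case, the loop keeps parsing items until it encounters the matching close parenthesis ")", as detected by the endList function.</p>
<p>The interesting part is the recursion. The simplest case is an array. Until the closing ")" is seen, we use Index to obtain the variable for each array element and call read recursively to populate it. As with many other errors, if the input data causes the decoder to index beyond the end of the array, the decoder panics. Slices are handled similarly, except that we create a new variable for each element, populate it, and then append it to the slice.</p>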
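<p>The array and slice cases rely on two reflection idioms: <code>Index</code> yields an addressable element of an existing array, while a slice element has to be created with <code>reflect.New</code> and appended. The following sketch isolates both idioms using integers in place of a recursive call to read; the function names <code>fillArray</code> and <code>appendInts</code> are made up for this illustration.</p>

```go
package main

import (
	"fmt"
	"reflect"
)

// fillArray stores vals into the array that arrayPtr points to.
// Index panics if i is out of range, just as the decoder would.
func fillArray(arrayPtr interface{}, vals []int64) {
	v := reflect.ValueOf(arrayPtr).Elem()
	for i, x := range vals {
		v.Index(i).SetInt(x)
	}
}

// appendInts appends vals to the slice that slicePtr points to.
func appendInts(slicePtr interface{}, vals []int64) {
	v := reflect.ValueOf(slicePtr).Elem()
	for _, x := range vals {
		item := reflect.New(v.Type().Elem()).Elem() // new element variable
		item.SetInt(x)
		v.Set(reflect.Append(v, item))
	}
}

func main() {
	var a [3]int
	fillArray(&a, []int64{1, 2, 3})
	fmt.Println(a)

	var s []int
	appendInts(&s, []int64{4, 5})
	fmt.Println(s)
}
```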
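<p>The loops for structs and maps must parse a (key value) sublist on each iteration. For structs, the key is a symbol that names the field. As with arrays, we use FieldByName to obtain the existing variable for the struct field and call read recursively to populate it. For maps, the key may be of any type, and as with slices, we create a new variable, recursively populate it, and finally insert the new key/value pair into the map.</p>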
<pre><code class="lang-Go"><span class="hljs-keyword">func</span> readList(lex *lexer, v reflect.Value) {
    <span class="hljs-keyword">switch</span> v.Kind() {
    <span class="hljs-keyword">case</span> reflect.Array: <span class="hljs-comment">// (item ...)</span>
        <span class="hljs-keyword">for</span> i := <span class="hljs-number">0</span>; !endList(lex); i++ {
            read(lex, v.Index(i))
        }

    <span class="hljs-keyword">case</span> reflect.Slice: <span class="hljs-comment">// (item ...)</span>
        <span class="hljs-keyword">for</span> !endList(lex) {
            item := reflect.New(v.Type().Elem()).Elem()
            read(lex, item)
            v.Set(reflect.Append(v, item))
        }

    <span class="hljs-keyword">case</span> reflect.Struct: <span class="hljs-comment">// ((name value) ...)</span>
        <span class="hljs-keyword">for</span> !endList(lex) {
            lex.consume(<span class="hljs-string">'('</span>)
            <span class="hljs-keyword">if</span> lex.token != scanner.Ident {
                <span class="hljs-built_in">panic</span>(fmt.Sprintf(<span class="hljs-string">"got token %q, want field name"</span>, lex.text()))
            }
            name := lex.text()
            lex.next()
            read(lex, v.FieldByName(name))
            lex.consume(<span class="hljs-string">')'</span>)
        }

    <span class="hljs-keyword">case</span> reflect.Map: <span class="hljs-comment">// ((key value) ...)</span>
        v.Set(reflect.MakeMap(v.Type()))
        <span class="hljs-keyword">for</span> !endList(lex) {
            lex.consume(<span class="hljs-string">'('</span>)
            key := reflect.New(v.Type().Key()).Elem()
            read(lex, key)
            value := reflect.New(v.Type().Elem()).Elem()
            read(lex, value)
            v.SetMapIndex(key, value)
            lex.consume(<span class="hljs-string">')'</span>)
        }

    <span class="hljs-keyword">default</span>:
        <span class="hljs-built_in">panic</span>(fmt.Sprintf(<span class="hljs-string">"cannot decode list into %v"</span>, v.Type()))
    }
}

<span class="hljs-keyword">func</span> endList(lex *lexer) <span class="hljs-typename">bool</span> {
    <span class="hljs-keyword">switch</span> lex.token {
    <span class="hljs-keyword">case</span> scanner.EOF:
        <span class="hljs-built_in">panic</span>(<span class="hljs-string">"end of file"</span>)
    <span class="hljs-keyword">case</span> <span class="hljs-string">')'</span>:
        <span class="hljs-keyword">return</span> <span class="hljs-constant">true</span>
    }
    <span class="hljs-keyword">return</span> <span class="hljs-constant">false</span>
}
</code></pre>
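<p>The struct and map cases of readList can also be exercised on their own. The sketch below is only an illustration under assumed names: <code>point</code>, <code>setField</code>, and <code>putEntry</code> do not appear in the sexpr package, and fixed integer and string values stand in for the recursive calls to read.</p>

```go
package main

import (
	"fmt"
	"reflect"
)

// point is a hypothetical struct used only for this example.
type point struct{ X, Y int }

// setField sets the named integer field of the struct that structPtr
// points to. Unlike readList, it reports an unknown field as an error
// rather than panicking.
func setField(structPtr interface{}, name string, val int64) error {
	f := reflect.ValueOf(structPtr).Elem().FieldByName(name)
	if !f.IsValid() {
		return fmt.Errorf("no field %q", name)
	}
	f.SetInt(val)
	return nil
}

// putEntry inserts key/val into the map[string]int-like map that mapPtr
// points to, creating the map first if it is nil.
func putEntry(mapPtr interface{}, key string, val int64) {
	v := reflect.ValueOf(mapPtr).Elem()
	if v.IsNil() {
		v.Set(reflect.MakeMap(v.Type()))
	}
	k := reflect.New(v.Type().Key()).Elem() // new key variable
	k.SetString(key)
	e := reflect.New(v.Type().Elem()).Elem() // new element variable
	e.SetInt(val)
	v.SetMapIndex(k, e)
}

func main() {
	var p point
	if err := setField(&p, "Y", 9); err != nil {
		fmt.Println(err)
	}
	fmt.Printf("%+v\n", p)

	var m map[string]int
	putEntry(&m, "a", 1)
	fmt.Println(m)
}
```

<p>One difference worth noting: readList calls MakeMap unconditionally, so decoding replaces any existing map, whereas this sketch keeps existing entries.</p>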
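<p>Finally, we wrap the parser in an exported function, Unmarshal, which hides some of the rough edges of the implementation such as initialization and cleanup. Errors encountered during parsing result in a panic, so Unmarshal uses a deferred call to recover from the panic (§5.10) and returns an error describing it instead.</p>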
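<p>The panic-to-error pattern described here is independent of S-expressions. As a small sketch of the same idea, the hypothetical functions below (<code>parseDigit</code> and <code>sumDigits</code>, invented for this example) let an internal helper panic while the outer function converts the panic into an ordinary error through a named result.</p>

```go
package main

import "fmt"

// parseDigit panics on bad input, like the internal parser functions.
func parseDigit(c byte) int {
	if c < '0' || c > '9' {
		panic(fmt.Sprintf("unexpected character %q", c))
	}
	return int(c - '0')
}

// sumDigits adds up the decimal digits in s. Any panic raised while
// parsing is recovered and reported through the named result err.
func sumDigits(s string) (sum int, err error) {
	defer func() {
		if x := recover(); x != nil {
			sum = 0
			err = fmt.Errorf("sumDigits: %v", x)
		}
	}()
	for i := 0; i < len(s); i++ {
		sum += parseDigit(s[i])
	}
	return sum, nil
}

func main() {
	fmt.Println(sumDigits("123"))
	fmt.Println(sumDigits("1x3"))
}
```

<p>The deferred function can overwrite <code>err</code> only because the result is named; with an unnamed result the recovered value would have nowhere to go.</p>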
<pre><code class="lang-Go"><span class="hljs-comment">// Unmarshal parses S-expression data and populates the variable</span>
<span class="hljs-comment">// whose address is in the non-nil pointer out.</span>
<span class="hljs-keyword">func</span> Unmarshal(data []<span class="hljs-typename">byte</span>, out <span class="hljs-keyword">interface</span>{}) (err error) {
    lex := &lexer{scan: scanner.Scanner{Mode: scanner.GoTokens}}
    lex.scan.Init(bytes.NewReader(data))
    lex.next() <span class="hljs-comment">// get the first token</span>
    <span class="hljs-keyword">defer</span> <span class="hljs-keyword">func</span>() {
        <span class="hljs-comment">// <span class="hljs-doctag">NOTE:</span> this is not an example of ideal error handling.</span>
        <span class="hljs-keyword">if</span> x := <span class="hljs-built_in">recover</span>(); x != <span class="hljs-constant">nil</span> {
            err = fmt.Errorf(<span class="hljs-string">"error at %s: %v"</span>, lex.scan.Position, x)
        }
    }()
    read(lex, reflect.ValueOf(out).Elem())
    <span class="hljs-keyword">return</span> <span class="hljs-constant">nil</span>
}
</code></pre>
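<p>A production-quality implementation should not panic on any input, and it should report an informative error for every mishap, for example with the line number and position of the offending input. Nonetheless, we hope this example conveys how packages like encoding/json work under the hood, and how reflection can be used to populate data structures.</p>
<p><strong>Exercise 12.8:</strong> The sexpr.Unmarshal function, like json.Marshal (translator's note: this is probably a slip; I believe <code>json.Unmarshal</code> is meant), requires the complete input in a byte slice before it can begin decoding. Define a sexpr.Decoder type that, like json.Decoder, allows a sequence of values to be decoded from an io.Reader. Change sexpr.Unmarshal to use this new type.</p>
<p><strong>Exercise 12.9:</strong> Write a token-based API for decoding S-expressions, following the style of xml.Decoder (§7.14). You will need five types of tokens: Symbol, String, Int, StartList, and EndList.</p>
<p><strong>Exercise 12.10:</strong> Extend sexpr.Unmarshal to handle the booleans, floating-point numbers, and interfaces encoded by your solution to <strong>Exercise 12.3</strong>. (Hint: to decode interfaces, you will need a mapping from the name of each supported type to its reflect.Type.)</p>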
</section>