I am trying to read the associated Doc comments on a struct type using Go’s parser and ast packages. In this example, the code simply uses itself as the source.
You need to use the go/doc package to extract documentation from the AST:
package main

import (
	"fmt"
	"go/doc"
	"go/parser"
	"go/token"
)

// FirstType docs
type FirstType struct {
	// FirstMember docs
	FirstMember string
}

// SecondType docs
type SecondType struct {
	// SecondMember docs
	SecondMember string
}

// Main docs
func main() {
	fset := token.NewFileSet() // positions are relative to fset
	d, err := parser.ParseDir(fset, "./", nil, parser.ParseComments)
	if err != nil {
		fmt.Println(err)
		return
	}
	for k, f := range d {
		fmt.Println("package", k)
		p := doc.New(f, "./", 0)
		for _, t := range p.Types {
			fmt.Println(" type", t.Name)
			fmt.Println(" docs:", t.Doc)
		}
	}
}
Looking at the source code of go/doc, we can see that it has to deal with this same case in the readType function. There, it says:
func (r *reader) readType(decl *ast.GenDecl, spec *ast.TypeSpec) {
	...
	// compute documentation
	doc := spec.Doc
	spec.Doc = nil // doc consumed - remove from AST
	if doc == nil {
		// no doc associated with the spec, use the declaration doc, if any
		doc = decl.Doc
	}
	...
Notice in particular how it needs to deal with the case where the AST does not have a doc attached to the TypeSpec. To do this, it falls back on the GenDecl. This gives us a clue as to how we might use the AST directly to parse doc comments for structs. Adapting the for loop in the question code to add a case for *ast.GenDecl:
for _, f := range d {
	ast.Inspect(f, func(n ast.Node) bool {
		switch x := n.(type) {
		case *ast.FuncDecl:
			fmt.Printf("%s:\tFuncDecl %s\t%s\n", fset.Position(n.Pos()), x.Name, x.Doc.Text())
		case *ast.TypeSpec:
			fmt.Printf("%s:\tTypeSpec %s\t%s\n", fset.Position(n.Pos()), x.Name, x.Doc.Text())
		case *ast.Field:
			fmt.Printf("%s:\tField %s\t%s\n", fset.Position(n.Pos()), x.Names, x.Doc.Text())
		case *ast.GenDecl:
			fmt.Printf("%s:\tGenDecl %s\n", fset.Position(n.Pos()), x.Doc.Text())
		}
		return true
	})
}
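Running this gives us the following (the raw *ast.CommentGroup formatting is what fmt prints when x.Doc itself is passed to %s; with x.Doc.Text() only the comment text appears):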
main.go:3:1: GenDecl %!s(*ast.CommentGroup=<nil>)
main.go:11:1: GenDecl &{[%!s(*ast.Comment=&{69 // FirstType docs})]}
main.go:11:6: TypeSpec FirstType %!s(*ast.CommentGroup=<nil>)
main.go:13:2: Field [FirstMember] &{[%!s(*ast.Comment=&{112 // FirstMember docs})]}
main.go:17:1: GenDecl &{[%!s(*ast.Comment=&{155 // SecondType docs})]}
main.go:17:6: TypeSpec SecondType %!s(*ast.CommentGroup=<nil>)
main.go:19:2: Field [SecondMember] &{[%!s(*ast.Comment=&{200 // SecondMember docs})]}
main.go:23:1: FuncDecl main &{[%!s(*ast.Comment=&{245 // Main docs})]}
main.go:33:23: Field [n] %!s(*ast.CommentGroup=<nil>)
main.go:33:35: Field [] %!s(*ast.CommentGroup=<nil>)
We've printed out the long-lost FirstType docs and SecondType docs! But this is unsatisfactory. Why is the doc not attached to the TypeSpec? The go/doc/reader.go file goes to extraordinary lengths to circumvent this issue, actually generating a fake GenDecl for each type in a grouped declaration and passing it to the readType function mentioned earlier, so that a type with no documentation of its own can still pick up the declaration's documentation!
fake := &ast.GenDecl{
	Doc: d.Doc,
	// don't use the existing TokPos because it
	// will lead to the wrong selection range for
	// the fake declaration if there are more
	// than one type in the group (this affects
	// src/cmd/godoc/godoc.go's posLink_urlFunc)
	TokPos: s.Pos(),
	Tok:    token.TYPE,
	Specs:  []ast.Spec{s},
}
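To observe the effect of that fake declaration from the outside, here is a minimal sketch that runs go/doc over a parenthesized type group held in memory. The source string, the import path "example.com/demo", and the helper name docTypes are made up for illustration; doc.NewFromFiles is used instead of doc.New only because it accepts parsed files directly.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/doc"
	"go/parser"
	"go/token"
)

// groupedSrc is a hypothetical input: one type in the group has its own
// comment, the other has none.
const groupedSrc = `package demo

// Group docs
type (
	// Inner docs
	Inner struct{}

	Bare struct{}
)
`

// docTypes (hypothetical helper) returns the documentation go/doc reports
// for each exported type in the given source.
func docTypes(source string) (map[string]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "demo.go", source, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	// The import path is an arbitrary placeholder.
	p, err := doc.NewFromFiles(fset, []*ast.File{f}, "example.com/demo")
	if err != nil {
		return nil, err
	}
	out := make(map[string]string)
	for _, t := range p.Types {
		out[t.Name] = t.Doc
	}
	return out, nil
}

func main() {
	docs, err := docTypes(groupedSrc)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, name := range []string{"Inner", "Bare"} {
		fmt.Printf("%s: %q\n", name, docs[name])
	}
}
```

Inner keeps its own comment, while Bare, which has none, is reported with the comment of the enclosing declaration.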
Imagine we changed the type definitions from the code in the question slightly (defining structs like this is not common, but it is still valid Go):
// This documents FirstType and SecondType together
type (
	// FirstType docs
	FirstType struct {
		// FirstMember docs
		FirstMember string
	}
	// SecondType docs
	SecondType struct {
		// SecondMember docs
		SecondMember string
	}
)
Run the code (including the case for *ast.GenDecl) and we get:
main.go:3:1: GenDecl %!s(*ast.CommentGroup=<nil>)
main.go:11:1: GenDecl &{[%!s(*ast.Comment=&{69 // This documents FirstType and SecondType together})]}
main.go:13:2: TypeSpec FirstType &{[%!s(*ast.Comment=&{129 // FirstType docs})]}
main.go:15:3: Field [FirstMember] &{[%!s(*ast.Comment=&{169 // FirstMember docs})]}
main.go:19:2: TypeSpec SecondType &{[%!s(*ast.Comment=&{215 // SecondType docs})]}
main.go:21:3: Field [SecondMember] &{[%!s(*ast.Comment=&{257 // SecondMember docs})]}
main.go:26:1: FuncDecl main &{[%!s(*ast.Comment=&{306 // Main docs})]}
main.go:36:23: Field [n] %!s(*ast.CommentGroup=<nil>)
main.go:36:35: Field [] %!s(*ast.CommentGroup=<nil>)
Now the struct type definitions have their docs, and the GenDecl has its own documentation, too. In the first case, posted in the question, the doc was attached to the GenDecl, since the AST treats an individual struct type definition as a "contraction" of the parenthesized form of type definitions, and wants to handle all definitions the same way, whether they are grouped or not. The same thing would happen with variable definitions, as in:
// some general docs
var (
	// v docs
	v int
	// v2 docs
	v2 string
)
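The fallback that readType performs can be reproduced directly on the AST. The following is a minimal sketch, not the go/doc implementation: the helper name typeDocs and the sample source are hypothetical, and it only looks at top-level type declarations.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a hypothetical input mixing an ungrouped and a grouped declaration.
const src = `package demo

// Alone docs
type Alone struct{}

// Group docs
type (
	// Inner docs
	Inner struct{}

	Bare struct{}
)
`

// typeDocs (hypothetical helper) maps each top-level type name to its doc
// text, preferring the TypeSpec comment and falling back to the comment on
// the enclosing GenDecl.
func typeDocs(source string) (map[string]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "demo.go", source, parser.ParseComments)
	if err != nil {
		return nil, err
	}
	docs := make(map[string]string)
	for _, decl := range f.Decls {
		gd, ok := decl.(*ast.GenDecl)
		if !ok || gd.Tok != token.TYPE {
			continue // skip funcs, imports, vars and consts
		}
		for _, spec := range gd.Specs {
			ts, ok := spec.(*ast.TypeSpec)
			if !ok {
				continue
			}
			cg := ts.Doc
			if cg == nil {
				cg = gd.Doc // no doc on the spec: use the declaration doc
			}
			docs[ts.Name.Name] = cg.Text() // Text returns "" for a nil group
		}
	}
	return docs, nil
}

func main() {
	docs, err := typeDocs(src)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, name := range []string{"Alone", "Inner", "Bare"} {
		fmt.Printf("%s: %q\n", name, docs[name])
	}
}
```

Alone and Bare both end up with the comment from their GenDecl, while Inner keeps the comment attached to its own TypeSpec.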
So if you wish to parse comments with pure AST, you need to be aware that this is how it works. But the preferred method, as @mjibson suggested, is to use go/doc. Good luck!