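A few days ago a friend ran into an index-out-of-range panic caused by the leveldb library that Ethereum uses, and we spent a long time discussing it. If you keep running an Ethereum node, you will hit this problem sooner or later, so this article analyzes it to help you prepare in advance.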

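The exception

Let's first look at the exact error output. Ordinary errors can usually be cleared by restarting the geth node, but if you see the panic below, neither a restart nor a version upgrade will fix it.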

INFO [04-28|10:03:35] Starting peer-to-peer node               instance=Geth/v1.7.3-stable/linux-amd64/go1.9
INFO [04-28|10:03:35] Allocated cache and file handles         database=/mnt/data/eth/geth/chaindata cache=128 handles=1024
panic: runtime error: index out of range

goroutine 1 [running]:
github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb.shortenb(0x10040ff1126, 0x4, 0xc4204bf9f8)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/util.go:30 +0x14d
github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb.(*version).computeCompaction(0xc4416120f0)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/version.go:395 +0x4b3
github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb.(*versionStaging).finish(0xc4204bfd18, 0xc4201e8000)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/version.go:510 +0x935
github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb.(*version).spawn(0xc420182230, 0xc4201e8000, 0xc420182230)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/version.go:279 +0x7a
github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb.(*session).commit(0xc4201d0240, 0xc4201e8000, 0x0, 0x0)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/session.go:195 +0x88
github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb.(*DB).recoverJournal(0xc4200ee600, 0xc4200ee600, 0xc420068660)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/db.go:538 +0xdb8
github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb.openDB(0xc4201d0240, 0x0, 0x0, 0xc4201d0240)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/db.go:122 +0x6ba
github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb.Open(0x185e540, 0xc4202047e0, 0xc4204c02f0, 0x0, 0x0, 0x0)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/db.go:194 +0x100
github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb.OpenFile(0xc4201f7ba0, 0x1c, 0xc4204c02f0, 0xc4201c5bb0, 0x4, 0x4)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/db.go:216 +0x97
github.com/ethereum/go-ethereum/ethdb.NewLDBDatabase(0xc4201f7ba0, 0x1c, 0x80, 0x400, 0x1c, 0x0, 0x0)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/ethdb/database.go:72 +0x363
github.com/ethereum/go-ethereum/node.(*ServiceContext).OpenDatabase(0xc4202a2da0, 0xf4ad6c, 0x9, 0x80, 0x400, 0x0, 0x0, 0x0, 0x0)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/node/service.go:46 +0x133
github.com/ethereum/go-ethereum/eth.CreateDB(0xc4202a2da0, 0xc4203f4800, 0xf4ad6c, 0x9, 0x0, 0x0, 0x0, 0x0)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/eth/backend.go:201 +0x5d
github.com/ethereum/go-ethereum/eth.New(0xc4202a2da0, 0xc4203f4800, 0x181d560, 0xc4204c5808, 0x417268)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/eth/backend.go:111 +0x93
github.com/ethereum/go-ethereum/cmd/utils.RegisterEthService.func2(0xc4202a2da0, 0xc4204b4420, 0xc4204c5b18, 0x0, 0xc4204b4450)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/cmd/utils/flags.go:1065 +0x3d
github.com/ethereum/go-ethereum/node.(*Node).Start(0xc4201f4480, 0x0, 0x0)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/node/node.go:175 +0x433
github.com/ethereum/go-ethereum/cmd/utils.StartNode(0xc4201f4480)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/cmd/utils/cmd.go:62 +0x2f
main.startNode(0xc420230840, 0xc4201f4480)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/cmd/geth/main.go:225 +0x43
main.geth(0xc420230840, 0xffbbe0, 0xb2d05e00)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/cmd/geth/main.go:215 +0x43
github.com/ethereum/go-ethereum/vendor/gopkg.in/urfave/cli%2ev1.HandleAction(0xdd0280, 0xffd068, 0xc420230840, 0xc420230840, 0xc4204c5f40)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/gopkg.in/urfave/cli.v1/app.go:490 +0xd2
github.com/ethereum/go-ethereum/vendor/gopkg.in/urfave/cli%2ev1.(*App).Run(0xc420250000, 0xc4200100e0, 0xe, 0xe, 0x0, 0x0)
        /mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/gopkg.in/urfave/cli.v1/app.go:264

Analyzing the exception

Start with the first source location in the stack trace:

/mnt/go/src/github.com/ethereum/go-ethereum-1.7.3/build/_workspace/src/github.com/ethereum/go-ethereum/vendor/github.com/syndtr/goleveldb/leveldb/util.go:30

Opening that source file shows the following code:

var bunits = [...]string{"", "Ki", "Mi", "Gi"}

func shortenb(bytes int) string {
    i := 0
    for ; bytes > 1024 && i < 4; i++ {
        bytes /= 1024
    }
    return fmt.Sprintf("%d%sB", bytes, bunits[i])
}

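The panic is raised by the return statement: when bunits[i] is read, the value of i lies outside the bounds of the bunits array.

Walking through the loop: for any bytes up to 1024 * 1024 * 1024 * 1024 (1 TiB), the loop stops with i <= 3, which is fine. Once bytes goes past that, into the TB range, the loop divides a fourth time and i becomes 4. The last valid index of bunits is 3, so the index-out-of-range panic follows. Strictly speaking, because of integer division the fourth division happens from 1025 * 1024 * 1024 * 1024 upward. The first value shown for shortenb in the trace above, 0x10040ff1126, is exactly such a number: slightly more than 1 TiB.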
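The threshold can be checked with a small standalone program. This is only a sketch: the loop is copied from the excerpt above into a local function named shortenbOriginal, tryShorten is a hypothetical helper added here to turn the panic into a flag, and a 64-bit int is assumed (as on the linux-amd64 build in the trace).

```go
package main

import "fmt"

// Local copy of the table and loop quoted in the article. It is renamed so
// that it is not mistaken for the goleveldb source file itself.
var bunits = [...]string{"", "Ki", "Mi", "Gi"}

func shortenbOriginal(bytes int) string {
	i := 0
	for ; bytes > 1024 && i < 4; i++ {
		bytes /= 1024
	}
	return fmt.Sprintf("%d%sB", bytes, bunits[i])
}

// tryShorten is a hypothetical helper: it calls shortenbOriginal and reports
// whether the call panicked instead of letting the program crash.
func tryShorten(bytes int) (s string, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	return shortenbOriginal(bytes), false
}

func main() {
	// 1 GiB, exactly 1 TiB, and the argument seen in the stack trace.
	for _, n := range []int{1 << 30, 1 << 40, 0x10040ff1126} {
		s, panicked := tryShorten(n)
		fmt.Printf("bytes=%d result=%q panicked=%v\n", n, s, panicked)
	}
}
```

Exactly 1 TiB still formats without a panic; only the value taken from the trace makes i reach 4.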

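Why did I say everyone will run into this sooner or later? If you sync the blockchain in full mode from the start, or switched to full mode early on, the data volume reaches the TB range fairly quickly, and once it does, this piece of leveldb code panics with the index out of range.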

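Solution

With the cause identified, how do we fix it? One idea is to extend the bunits array with a "Ti" entry. I would not guarantee that this fixes the problem, though: it only adds one more unit to the array, and it is unclear whether other code can cope with that unit. Going this way would probably mean reading through the related code and then testing the change before relying on it.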
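Looking at shortenb in isolation, the "Ti" idea might look like the sketch below. The names bunitsTi and shortenbTi are made up for this illustration, and a 64-bit int is assumed. It only shows that the index stays in range inside this one function; it says nothing about how other code treats the resulting string.

```go
package main

import "fmt"

// Hypothetical five-entry table: the four original prefixes plus "Ti".
var bunitsTi = [...]string{"", "Ki", "Mi", "Gi", "Ti"}

// shortenbTi keeps the original loop bound (i < 4). With five entries in the
// table, the largest index the loop can reach, 4, is now valid.
func shortenbTi(bytes int) string {
	i := 0
	for ; bytes > 1024 && i < 4; i++ {
		bytes /= 1024
	}
	return fmt.Sprintf("%d%sB", bytes, bunitsTi[i])
}

func main() {
	// The argument from the stack trace, slightly more than 1 TiB.
	fmt.Println(shortenbTi(0x10040ff1126))
}
```

Because the loop still stops at i == 4, anything beyond the TiB range is simply reported as a large TiB number rather than indexing past the table.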

A more lightweight change is to alter the loop condition from i<4 to i<3. The modified code looks like this:

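var bunits = [...]string{"", "Ki", "Mi", "Gi"}

func shortenb(bytes int) string {
    i := 0
    // Stop at i == 3 so that i never exceeds the last index of bunits ("Gi").
    for ; bytes > 1024 && i < 3; i++ {
        bytes /= 1024
    }
    // Sizes above 1 TiB are reported in GiB, for example 1025GiB.
    return fmt.Sprintf("%d%sB", bytes, bunits[i])
}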

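Running the earlier case of bytes > 1024 * 1024 * 1024 * 1024 through this version: once the size reaches the TB range, it is expressed as 1024 GiB or more, which stays within the largest unit of the original array.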

PS: Of course, after making this change you still need to test and verify it against data volumes of the corresponding scale.
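Before testing with real data volumes, the formatting function can at least be checked on its own around the unit boundaries. The following is a minimal sketch: shortenbPatched is a local copy of the i < 3 variant, the test values are chosen for illustration, and a 64-bit int is assumed.

```go
package main

import "fmt"

var bunits = [...]string{"", "Ki", "Mi", "Gi"}

// shortenbPatched is a local copy of the i < 3 variant from the article.
func shortenbPatched(bytes int) string {
	i := 0
	for ; bytes > 1024 && i < 3; i++ {
		bytes /= 1024
	}
	return fmt.Sprintf("%d%sB", bytes, bunits[i])
}

func main() {
	// Illustrative boundary values, ending with the argument from the trace.
	cases := []struct {
		in   int
		want string
	}{
		{512, "512B"},
		{1 << 20, "1024KiB"},
		{1 << 30, "1024MiB"},
		{1 << 40, "1024GiB"},
		{0x10040ff1126, "1025GiB"},
	}
	for _, c := range cases {
		got := shortenbPatched(c.in)
		status := "ok"
		if got != c.want {
			status = "MISMATCH"
		}
		fmt.Printf("%-8s bytes=%d got=%s want=%s\n", status, c.in, got, c.want)
	}
}
```

This only exercises the string formatting; it does not replace opening a real TB-scale chaindata directory with the rebuilt binary.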



Unless otherwise noted, all articles are original works of 程序新视界. Reposts must credit this article with a link.

Article link: http://choupangxia.com/2019/07/06/%e4%bb%a5%e5%a4%aa%e5%9d%8a%e6%9a%82%e6%9c%aa%e4%bf%ae%e5%a4%8d%e7%9a%84%e4%b8%80%e4%b8%aabug-%e6%95%b0%e7%bb%84%e8%b6%8a%e7%95%8c/