Reading file contents in Golang

This article is a quick tour of the many options the Go standard library offers for reading files.

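In Go, as in most low-level languages and some dynamic ones such as Node, reading a file returns a byte stream. One benefit of not converting everything to a string automatically is that it avoids expensive string allocations, which increase GC pressure.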

To keep this article simple, I will use string(arrayOfBytes) to convert byte slices to strings. This should not be taken as general advice for production code, though.

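1. Reading the entire file into memory

The standard library provides several functions and utilities for reading file data. We will start with the basic case offered by the os package. It implies two prerequisites:

The file has to fit in memory.
We need to know the size of the file in advance, so that we can allocate a buffer large enough to hold it.

With a handle to an os.File object, we can query the size and allocate a byte slice.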

package main


import (
 "os"
 "fmt"
)
func main() {
 file, err := os.Open("filetoread.txt")
 if err != nil {
 fmt.Println(err)
 return
 }
 defer file.Close()

 fileinfo, err := file.Stat()
 if err != nil {
 fmt.Println(err)
 return
 }

 filesize := fileinfo.Size()
 buffer := make([]byte, filesize)

 bytesread, err := file.Read(buffer)
 if err != nil {
 fmt.Println(err)
 return
 }
 fmt.Println("bytes read: ", bytesread)
 fmt.Println("bytestream to string: ", string(buffer))
}

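2. Reading a file in chunks

Reading a file in one go works in most cases, but sometimes we want a more memory-efficient approach: read the file in chunks of some size, process each chunk, and repeat until the end. The example below uses a buffer size of 100 bytes.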

package main


import (
 "io"
 "os"
 "fmt"
)

const BufferSize = 100

func main() {
 
 file, err := os.Open("filetoread.txt")
 if err != nil {
 fmt.Println(err)
 return
 }
 defer file.Close()

 buffer := make([]byte, BufferSize)

 for {
 bytesread, err := file.Read(buffer)
 if err != nil {
  if err != io.EOF {
  fmt.Println(err)
  }
  break
 }
 fmt.Println("bytes read: ", bytesread)
 fmt.Println("bytestream to string: ", string(buffer[:bytesread]))
 }
}

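Compared with reading the file in full, the main differences are:

We read until we hit the EOF marker, so the error handling includes a specific check against io.EOF.
We define the buffer size, so we control the size of each "chunk". Used properly, this can improve performance, because the operating system caches the file that is being read.
If the file size is not an integer multiple of the buffer size, the last iteration puts only the remaining bytes into the buffer, hence the call to buffer[:bytesread]. In the normal case, bytesread equals the buffer size.

On every iteration of the loop, an internal file pointer is updated. The next read returns the data that starts at that offset, up to the size of the buffer. This pointer is not a construct of the language but of the operating system. On Linux, it belongs to the open file that the file descriptor refers to. All read/Read calls (in Ruby/Go respectively) are translated internally into system calls and sent to the kernel, and the kernel manages this pointer.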

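3. Reading file chunks concurrently

What if we want to speed up the processing of the chunks above? One way is to use multiple goroutines! The one extra piece of work, compared with reading the chunks serially, is that we need to know the offset for each goroutine. Note that ReadAt behaves slightly differently from Read when the target buffer is larger than the number of bytes left.

Also note that I am not limiting the number of goroutines; it is determined solely by the buffer size. In practice there will probably be an upper bound on this number.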

package main

import (
 "fmt"
 "os"
 "sync"
)

const BufferSize = 100

type chunk struct {
 bufsize int
 offset int64
}

func main() {
 
 file, err := os.Open("filetoread.txt")
 if err != nil {
 fmt.Println(err)
 return
 }
 defer file.Close()

 fileinfo, err := file.Stat()
 if err != nil {
 fmt.Println(err)
 return
 }

 filesize := int(fileinfo.Size())
 // Number of goroutines we need to spawn.
 concurrency := filesize / BufferSize
 // Buffer sizes that each of the goroutines below should use. ReadAt
 // returns io.EOF together with the bytes it did read when the buffer is
 // larger than the number of bytes left in the file.
 chunksizes := make([]chunk, concurrency)

 // All buffer sizes are the same in the normal case. Offsets depend on the
 // index. Second go routine should start at 100, for example, given our
 // buffer size of 100.
 for i := 0; i < concurrency; i++ {
 chunksizes[i].bufsize = BufferSize
 chunksizes[i].offset = int64(BufferSize * i)
 }

 // Check for any leftover bytes. Add the residual number of bytes as the
 // last chunk size.
 if remainder := filesize % BufferSize; remainder != 0 {
 c := chunk{bufsize: remainder, offset: int64(concurrency * BufferSize)}
 concurrency++
 chunksizes = append(chunksizes, c)
 }

 var wg sync.WaitGroup
 wg.Add(concurrency)

 for i := 0; i < concurrency; i++ {
 go func(chunksizes []chunk, i int) {
  defer wg.Done()

  chunk := chunksizes[i]
  buffer := make([]byte, chunk.bufsize)
  bytesread, err := file.ReadAt(buffer, chunk.offset)

  if err != nil {
  fmt.Println(err)
  return
  }

  fmt.Println("bytes read: ", bytesread)
  fmt.Println("bytestream to string: ", string(buffer[:bytesread]))
 }(chunksizes, i)
 }

 wg.Wait()
}

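This approach involves quite a bit more than any of the previous ones:

I create a specific number of goroutines, which depends on the file size and the buffer size (100 in our case).
We need a way to make sure we "wait" for all the goroutines to finish. In this example I use a wait group.
Instead of breaking out of a for loop, each goroutine signals from the inside when it is done. Because the call to wg.Done() is deferred, it runs when the goroutine returns.

Note: always check the number of bytes returned, and reslice the output buffer accordingly.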

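Reading a file with Read() can take you a long way, but sometimes you need more convenience. In Ruby, IO functions such as each_line, each_char, and each_codepoint are used all the time. We can achieve something similar with the Scanner type and its associated functions from the bufio package.

The bufio.Scanner type takes a "split" function and advances a pointer based on it. For example, the built-in bufio.ScanLines split function advances the pointer to the next newline character on every iteration.

At each step, the type also exposes methods for getting the byte slice or string between the start and end positions.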

package main

import (
 "fmt"
 "os"
 "bufio"
)

func main() {
 file, err := os.Open("filetoread.txt")
 if err != nil {
 fmt.Println(err)
 return
 }
 defer file.Close()
 scanner := bufio.NewScanner(file)
 scanner.Split(bufio.ScanLines)

 // Returns a boolean based on whether there's a next instance of `\n`
 // character in the IO stream. This step also advances the internal pointer
 // to the next position (after '\n') if it did find that token.
 for {
 read := scanner.Scan()
 if !read {
  break
  
 }
 fmt.Println("read byte array: ", scanner.Bytes())
 fmt.Println("read string: ", scanner.Text())
 }
 
}

So, to read an entire file line by line this way, something like the following can be used:

package main

import (
 "bufio"
 "fmt"
 "os"
)

func main() {
 file, err := os.Open("filetoread.txt")
 if err != nil {
 fmt.Println(err)
 return
 }
 defer file.Close()

 scanner := bufio.NewScanner(file)
 scanner.Split(bufio.ScanLines)

 // This is our buffer now
 var lines []string

 for scanner.Scan() {
 lines = append(lines, scanner.Text())
 }

 fmt.Println("read lines:")
 for _, line := range lines {
 fmt.Println(line)
 }
}

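4. Scanning word by word

The bufio package contains a few basic predefined split functions:

ScanLines (the default)
ScanWords
ScanRunes (very useful for iterating over UTF-8 code points rather than bytes)
ScanBytes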

To read a file and build a list of the words in it, something like the following can be used:

package main

import (
 "bufio"
 "fmt"
 "os"
)

func main() {
 file, err := os.Open("filetoread.txt")
 if err != nil {
 fmt.Println(err)
 return
 }
 defer file.Close()

 scanner := bufio.NewScanner(file)
 scanner.Split(bufio.ScanWords)

 var words []string

 for scanner.Scan() {
 words = append(words, scanner.Text())
 }

 fmt.Println("word list:")
 for _, word := range words {
 fmt.Println(word)
 }
}

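The ScanBytes split function gives the same output as the earlier Read() examples. One major difference between the two is the dynamic allocation that takes place in the scanner loop every time we append to the byte or string slice. This can be avoided with techniques such as pre-initializing the buffer to a specific length and growing it only when that limit is reached. Using the same example as above: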

package main

import (
 "bufio"
 "fmt"
 "os"
)

func main() {
 file, err := os.Open("filetoread.txt")
 if err != nil {
 fmt.Println(err)
 return
 }
 defer file.Close()

 scanner := bufio.NewScanner(file)
 scanner.Split(bufio.ScanWords)

 // initial size of our wordlist
 bufferSize := 50
 words := make([]string, bufferSize)
 pos := 0

 for scanner.Scan() {
 words[pos] = scanner.Text()
 pos++

 if pos >= len(words) {
  // grow the slice by another bufferSize (50) slots
  newbuf := make([]string, bufferSize)
  words = append(words, newbuf...)
 }
 }

 // Scan returns false on both EOF and failure, so check for a non-EOF
 // error once the loop has ended.
 if err := scanner.Err(); err != nil {
 fmt.Println(err)
 }

 fmt.Println("word list:")
 // we are iterating only until the value of "pos" because our buffer size
 // might be more than the number of words because we increase the length by
 // a constant value. Or the scanner loop might've terminated due to an
 // error prematurely. In this case the "pos" contains the index of the last
 // successful update.
 for _, word := range words[:pos] {
 fmt.Println(word)
 }
}

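So we end up doing far fewer slice "grow" operations, but depending on the buffer size and the number of words in the file we may be left with some empty slots at the end. It is a trade-off.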

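5. Splitting a long string into words

bufio.NewScanner takes as its argument a type that satisfies the io.Reader interface, which means it works with any type that defines a Read method.
One of the string utility functions in the standard library that returns a reader is strings.NewReader. We can combine the two when reading words from a string: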

package main

import (
 "bufio"
 "fmt"
 "strings"
)

func main() {
 longstring := "This is a very long string. Not."
 var words []string
 scanner := bufio.NewScanner(strings.NewReader(longstring))
 scanner.Split(bufio.ScanWords)

 for scanner.Scan() {
 words = append(words, scanner.Text())
 }

 fmt.Println("word list:")
 for _, word := range words {
 fmt.Println(word)
 }
}

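6. Scanning comma-separated strings

Parsing a CSV file or string by hand with the basic file.Read() or the Scanner type is cumbersome, because according to the bufio.ScanWords split function a "word" is a run of runes delimited by Unicode whitespace. Reading individual runes and keeping track of the buffer size and position (as is done in lexing) is too much work.

This can be avoided, though. We can define a new split function that reads characters until the reader encounters a comma and then returns that chunk when Text() or Bytes() is called. The signature of bufio.SplitFunc looks like this: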

type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)

For simplicity, I show an example that reads a string rather than a file. A simple reader for CSV strings using the signature above could be:

package main

import (
 "bufio"
 "bytes"
 "fmt"
 "strings"
)

func main() {
 csvstring := "name, age, occupation"

 // An anonymous function declaration to avoid repeating main()
 ScanCSV := func(data []byte, atEOF bool) (advance int, token []byte, err error) {
 commaidx := bytes.IndexByte(data, ',')
 // >= 0 so that a comma at the very start of data also counts as a delimiter
 if commaidx >= 0 {
  // we need to return the next position
  buffer := data[:commaidx]
  return commaidx + 1, bytes.TrimSpace(buffer), nil
 }

 // if we are at the end of the string, just return the entire buffer
 if atEOF {
  // but only do that when there is some data. If not, this might mean
  // that we've reached the end of our input CSV string
  if len(data) > 0 {
  return len(data), bytes.TrimSpace(data), nil
  }
 }

 // when 0, nil, nil is returned, this is a signal to the interface to read
 // more data in from the input reader. In this case, this input is our
 // string reader and this pretty much will never occur.
 return 0, nil, nil
 }

 scanner := bufio.NewScanner(strings.NewReader(csvstring))
 scanner.Split(ScanCSV)

 for scanner.Scan() {
 fmt.Println(scanner.Text())
 }
}

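7. ioutil

We have seen several ways of reading a file. But what if you just want to read a file into a buffer?
ioutil is a package in the standard library that contains functions which make this a one-liner. Since Go 1.16 the io/ioutil package is deprecated, and the same functionality is available as os.ReadFile and os.ReadDir.

Reading an entire file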

package main

import (
 "io/ioutil"
 "log"
 "fmt"
)

func main() {
 bytes, err := ioutil.ReadFile("filetoread.txt")
 if err != nil {
 log.Fatal(err)
 }

 fmt.Println("Bytes read: ", len(bytes))
 fmt.Println("String read: ", string(bytes))
}

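This is closer to what we see in higher-level scripting languages.

Reading an entire directory of files

Needless to say, do not run this script if you have large files.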

package main

import (
 "io/ioutil"
 "log"
 "fmt"
)

func main() {
 filelist, err := ioutil.ReadDir(".")
 if err != nil {
 log.Fatal(err)
 }
 for _, fileinfo := range filelist {
 if fileinfo.Mode().IsRegular() {
  bytes, err := ioutil.ReadFile(fileinfo.Name())
  if err != nil {
  log.Fatal(err)
  }
  fmt.Println("Bytes read: ", len(bytes))
  fmt.Println("String read: ", string(bytes))
 }
 }
}
