  • Scala for the Impatient (9): Files and Regular Expressions

    Reading Lines

    To read all lines from a file, call the getLines method on a scala.io.Source object:

    import scala.io.Source
    val source = Source.fromFile("myfile.txt", "UTF-8")
    // The first argument can be a string or a java.io.File
    // You can omit the encoding if you know that the file uses
    // the default platform encoding
    val lineIterator = source.getLines()

    The result is an iterator (see Chapter 13). You can use it to process the lines one at a time:

    for (l <- lineIterator) process(l) // process stands for your own handler

    Or you can put the lines into an array or array buffer by applying the toArray or toBuffer method to the iterator:

    val lines = lineIterator.toArray

    Sometimes, you just want to read an entire file into a string. That’s even simpler:

    val contents = source.mkString

    Remember to call close when you are done using the Source object.
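
    One way to make sure close is always called is a try/finally block. The following is a minimal sketch, assuming a UTF-8 text file; the names ReadLinesDemo and readLines and the temporary file are illustrative and not part of the original text.

```scala
import scala.io.Source
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

object ReadLinesDemo {
  // Reads all lines of a UTF-8 file and closes the Source even if reading fails.
  def readLines(path: Path): Vector[String] = {
    val source = Source.fromFile(path.toFile, "UTF-8")
    try source.getLines().toVector // materialize the lines before closing
    finally source.close()
  }

  def main(args: Array[String]): Unit = {
    // A temporary file stands in for "myfile.txt".
    val tmp = Files.createTempFile("lines-demo", ".txt")
    Files.write(tmp, "first\nsecond\nthird".getBytes(StandardCharsets.UTF_8))
    try readLines(tmp).foreach(println)
    finally Files.delete(tmp)
  }
}
```

    The lines are copied into a Vector before the source is closed because the iterator returned by getLines reads lazily from the underlying file.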

    Reading Characters

    To read individual characters from a file, you can use a Source object directly as an iterator, since the Source class extends Iterator[Char]:

    for (c <- source) process(c)

    If you want to be able to peek at a character without consuming it (like istream::peek in C++ or a PushbackReader in Java), call the buffered method on the source object. Then you can inspect the next input character with the head method without consuming it.

    val source = Source.fromFile("myfile.txt", "UTF-8")
    val iter = source.buffered
    while (iter.hasNext) {
        if (isNice(iter.head)) // isNice is a placeholder for your own test
            process(iter.next())
        else
            ...
    }
    source.close()
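
    As a concrete instance of the peeking loop, the following sketch collects runs of digits from a string. It uses Source.fromString so that it runs without a file; the names PeekDemo and digitRuns are made up for this illustration.

```scala
import scala.io.Source

object PeekDemo {
  // Collects each maximal run of digits, using head to decide whether
  // the next character belongs to the current number before consuming it.
  def digitRuns(text: String): List[String] = {
    val source = Source.fromString(text)
    val iter = source.buffered
    val runs = List.newBuilder[String]
    try {
      while (iter.hasNext) {
        if (iter.head.isDigit) {
          val sb = new StringBuilder
          while (iter.hasNext && iter.head.isDigit) sb += iter.next()
          runs += sb.toString
        } else {
          iter.next() // skip a character that is not a digit
        }
      }
    } finally source.close()
    runs.result()
  }
}
```

    Calling PeekDemo.digitRuns("99 bottles, 98 bottles") yields List("99", "98").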

    Reading Tokens and Numbers

    Here is a quick-and-dirty way of reading all whitespace-separated tokens in a source.

    val tokens = source.mkString.split("\\s+")

    If you have a file containing floating-point numbers, you can read them all into an array:

    val numbers = for (w <- tokens) yield w.toDouble

    or 

    val numbers = tokens.map(_.toDouble)

    Remember you can always use the java.util.Scanner class to process a file that contains a mixture of text and numbers.
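
    A sketch of the Scanner approach, assuming input made of repeated "name count price" triples. That input layout and the names ScannerDemo and parseItems are assumptions for illustration only.

```scala
import java.util.{Locale, Scanner}

object ScannerDemo {
  // Parses input such as "apples 3 1.25 pears 2 0.8" into (name, count, price) triples.
  def parseItems(text: String): List[(String, Int, Double)] = {
    // A fixed locale makes "1.25" parse the same way on every machine.
    val in = new Scanner(text).useLocale(Locale.US)
    val items = List.newBuilder[(String, Int, Double)]
    try {
      while (in.hasNext()) {
        val name = in.next()
        val count = in.nextInt()
        val price = in.nextDouble()
        items += ((name, count, price))
      }
    } finally in.close()
    items.result()
  }
}
```

    To read from a file instead of a string, the Scanner can be constructed with new Scanner(new java.io.File("myfile.txt"), "UTF-8").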

    Reading from URLs and Other Sources

    The Source object can also read from sources other than files:

    val source1 = Source.fromURL("http://horstmann.com", "UTF-8")
    val source2 = Source.fromString("Hello, World!")
    // Reads from the given string—useful for debugging
    val source3 = Source.stdin
    // Reads from standard input

    Reading Binary Files

    Scala has no provision for reading binary files. You’ll need to use the Java library. Here is how you can read a file into a byte array:

    import java.io.{File, FileInputStream}
    val file = new File(filename)
    val in = new FileInputStream(file)
    val bytes = new Array[Byte](file.length.toInt)
    in.read(bytes)
    in.close()
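
    Note that a single call to InputStream.read is not guaranteed to fill the whole array. On Java 7 or later, Files.readAllBytes from java.nio.file is an alternative that reads the complete file. The sketch below assumes the file fits in memory; BinaryReadDemo and readBytes are illustrative names.

```scala
import java.nio.file.{Files, Path}

object BinaryReadDemo {
  // Reads the whole file into a byte array in one call.
  def readBytes(path: Path): Array[Byte] = Files.readAllBytes(path)

  def main(args: Array[String]): Unit = {
    // A temporary file stands in for a real binary file.
    val tmp = Files.createTempFile("bytes-demo", ".bin")
    Files.write(tmp, Array[Byte](1, 2, 3, -1))
    try println(readBytes(tmp).mkString(" ")) // prints: 1 2 3 -1
    finally Files.delete(tmp)
  }
}
```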

    Writing Text Files

    Scala has no built-in support for writing files. To write a text file, use a java.io.PrintWriter, for example:

    import java.io.PrintWriter
    val out = new PrintWriter("numbers.txt")
    for (i <- 1 to 100) out.println(i)
    out.close()
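
    A slightly more defensive variant might name the encoding explicitly and close the writer in a finally block. This is only a sketch; the names WriteDemo and writeSquares and the "number square" line format are invented for the example.

```scala
import java.io.PrintWriter
import java.nio.file.Path

object WriteDemo {
  // Writes one line per integer from 1 to n, each holding the number and its square.
  def writeSquares(path: Path, n: Int): Unit = {
    val out = new PrintWriter(path.toFile, "UTF-8")
    try {
      for (i <- 1 to n) out.println(s"$i ${i * i}")
    } finally out.close() // closing also flushes buffered output
  }
}
```

    For example, WriteDemo.writeSquares(java.nio.file.Paths.get("squares.txt"), 3) writes the lines "1 1", "2 4", and "3 9".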

    Visiting Directories

    There are currently no “official” Scala classes for visiting all files in a directory, or for recursively traversing directories. In this section, we discuss a couple of alternatives.

    It is simple to write a function that produces an iterator through all subdirectories of a directory:

    import java.io.File
    def subdirs(dir: File): Iterator[File] = {
        val children = dir.listFiles.filter(_.isDirectory)
        children.iterator ++ children.iterator.flatMap(subdirs _)
    }

    With this function, you can visit all subdirectories like this:

    for (d <- subdirs(dir)) process(d)

    Alternatively, if you use Java 7 or later, you can adapt the walkFileTree method of the java.nio.file.Files class. That method makes use of a FileVisitor interface. In Scala, we generally prefer to use function objects, not interfaces, for specifying work (even though in this case the interface allows more fine-grained control; see the Javadoc for details). The following implicit conversion adapts a function to the interface:

    import java.nio.file._
    import scala.language.implicitConversions
    implicit def makeFileVisitor(f: (Path) => Unit): FileVisitor[Path] = new SimpleFileVisitor[Path] {
        override def visitFile(p: Path, attrs: attribute.BasicFileAttributes) = {
            f(p)
            FileVisitResult.CONTINUE
        }
    }

    Then you can print all files in the directory tree with the call:

    Files.walkFileTree(dir.toPath, (f: Path) => println(f))

    Of course, if you don’t just want to print the files, you can specify other actions in the function that you pass to the walkFileTree method.
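
    For instance, instead of printing, the visitor can collect the visited paths. The sketch below passes a SimpleFileVisitor explicitly rather than relying on the implicit conversion; WalkDemo and allFiles are illustrative names.

```scala
import java.nio.file.{FileVisitResult, Files, Path, SimpleFileVisitor}
import java.nio.file.attribute.BasicFileAttributes
import scala.collection.mutable.ArrayBuffer

object WalkDemo {
  // Returns every non-directory entry below root, in the order walkFileTree visits them.
  def allFiles(root: Path): Vector[Path] = {
    val found = ArrayBuffer.empty[Path]
    Files.walkFileTree(root, new SimpleFileVisitor[Path] {
      override def visitFile(p: Path, attrs: BasicFileAttributes): FileVisitResult = {
        found += p
        FileVisitResult.CONTINUE
      }
    })
    found.toVector
  }
}
```

    Writing the visitor out by hand is more verbose, but it keeps the call site free of implicit conversions and leaves room to override other FileVisitor methods later.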

    Serialization

    Here is how you declare a serializable class in Java and Scala.

    Java:

    public class Person implements java.io.Serializable {
        private static final long serialVersionUID = 42L;
        ...
    }

    Scala:

    @SerialVersionUID(42L) class Person extends Serializable

    The Serializable trait is defined in the scala package and does not require an import. You can omit the @SerialVersionUID annotation if you are OK with the default ID.

    You serialize and deserialize objects in the usual way:

    val fred = new Person(...)
    import java.io._
    val out = new ObjectOutputStream(new FileOutputStream("/tmp/test.obj"))
    out.writeObject(fred)
    out.close()
    val in = new ObjectInputStream(new FileInputStream("/tmp/test.obj"))
    val savedFred = in.readObject().asInstanceOf[Person]
    in.close()

    The Scala collections are serializable, so you can have them as members of your serializable classes:

    import scala.collection.mutable.ArrayBuffer
    class Person extends Serializable {
        private val friends = new ArrayBuffer[Person] // OK: ArrayBuffer is serializable
        ...
    }
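
    To try serialization without touching the file system, an object can be written to a byte array and read back. In this sketch, Contact is a hypothetical stand-in for the Person class, and SerializationDemo.roundTrip is a helper invented for the example.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable.ArrayBuffer

// Contact is a hypothetical stand-in for the Person class in the text.
@SerialVersionUID(42L)
class Contact(val name: String) extends Serializable {
  val friends = new ArrayBuffer[Contact] // serialized together with the Contact
}

object SerializationDemo {
  // Serializes obj into an in-memory byte array and deserializes a copy from it.
  def roundTrip[T <: java.io.Serializable](obj: T): T = {
    val buffer = new ByteArrayOutputStream
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(obj) finally out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

    The copy returned by roundTrip is a distinct object whose name and friends match those of the original.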

    Process Control

    The scala.sys.process package provides utilities to interact with shell programs. You can write your shell scripts in Scala, with all the power that the Scala language puts at your disposal. Here is a simple example:

    import sys.process._
    import scala.language.postfixOps // enables the postfix ! syntax used in this section
    "ls -al .." !

    As a result, the ls -al .. command is executed, showing all files in the parent directory. The output is printed to standard output. The sys.process package contains an implicit conversion from strings to ProcessBuilder objects. The ! operator executes the ProcessBuilder object and returns the exit code of the program, which by convention is 0 on success and nonzero on failure.

    If you use !! instead of !, the output is returned as a string:

    val result = "ls -al .." !!

    You can pipe the output of one program into the input of another, using the #| operator:

    "ls -al .." #| "grep sec" !

    As you can see, the process library uses the commands of the underlying operating system. Here, I use bash commands because bash is available on Linux, Mac OS X, and Windows.

    To redirect the output to a file (a java.io.File), use the #> operator:

    "ls -al .." #> new File("output.txt") !

    To append to a file, use #>> instead:

    "ls -al .." #>> new File("output.txt") !

    To redirect input from a file, use #<:

    "grep sec" #< new File("output.txt") !

    You can also redirect input from a URL (a java.net.URL):

    "grep Scala" #< new URL("http://horstmann.com/index.html") !

    You can combine processes with p #&& q (execute q if p was successful) and p #|| q (execute q if p was unsuccessful). But frankly, Scala is better at control flow than the shell, so why not implement the control flow in Scala?
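
    Implementing the control flow in Scala could look like the following sketch: the exit code and the output lines are captured, and an ordinary if expression decides what happens next. It assumes a Unix-like system with ls on the PATH; ProcessDemo and run are illustrative names.

```scala
import scala.sys.process._

object ProcessDemo {
  // Runs the command and returns (exit code, captured standard output lines).
  def run(command: Seq[String]): (Int, List[String]) = {
    val lines = List.newBuilder[String]
    // The logger receives each stdout line; stderr lines are discarded here.
    val logger = ProcessLogger(line => lines += line, _ => ())
    val exitCode = Process(command).!(logger)
    (exitCode, lines.result())
  }

  def main(args: Array[String]): Unit = {
    // Assumption: "ls" exists, as on Linux or macOS.
    val (code, output) = run(Seq("ls", "-al", ".."))
    if (code == 0) output.filter(_.contains("sec")).foreach(println)
    else println(s"ls failed with exit code $code")
  }
}
```

    Passing the command as a Seq of arguments avoids relying on how a single command string is split into words.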

    Regular Expressions

    When you process input, you often want to use regular expressions to analyze it. The scala.util.matching.Regex class makes this simple. To construct a Regex instance, use the r method of the String class:

    val numPattern = "[0-9]+".r

    If the regular expression contains backslashes or quotation marks, then it is a good idea to use the “raw” string syntax, """...""":

    val wsnumwsPattern = """\s+[0-9]+\s+""".r
    // A bit easier to read than "\\s+[0-9]+\\s+".r

    The findAllIn method returns an iterator through all matches. You can use it in a for loop:

    for (matchString <- numPattern.findAllIn("99 bottles, 98 bottles"))
        process(matchString)

    or turn the iterator into an array:

    val matches = numPattern.findAllIn("99 bottles, 98 bottles").toArray
    // Array(99, 98)

    To find the first match anywhere in a string, use findFirstIn. You get an Option[String]. (See Chapter 14 for the Option class.)

    val m1 = wsnumwsPattern.findFirstIn("99 bottles, 98 bottles")
    // Some(" 98 ")

    To check whether the beginning of a string matches, use findPrefixOf :

    numPattern.findPrefixOf("99 bottles, 98 bottles")
    // Some(99)
    wsnumwsPattern.findPrefixOf("99 bottles, 98 bottles")
    // None

    You can replace the first match, or all matches:

    numPattern.replaceFirstIn("99 bottles, 98 bottles", "XX")
    // "XX bottles, 98 bottles"
    numPattern.replaceAllIn("99 bottles, 98 bottles", "XX")
    // "XX bottles, XX bottles"
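
    The matches are plain strings, so they can be converted and combined like any other collection elements. The sketch below sums the numbers in a text and also uses the overload of replaceAllIn that takes a function from Match to String, which is not covered above. RegexDemo and its method names are illustrative.

```scala
import scala.util.matching.Regex

object RegexDemo {
  val numPattern: Regex = "[0-9]+".r

  // Adds up every integer that appears in the text.
  def sumNumbers(text: String): Int =
    numPattern.findAllIn(text).map(_.toInt).sum

  // Replaces each number by its predecessor, computing the replacement per match.
  def decrementNumbers(text: String): String =
    numPattern.replaceAllIn(text, m => (m.matched.toInt - 1).toString)
}
```

    With the input "99 bottles, 98 bottles", sumNumbers returns 197 and decrementNumbers returns "98 bottles, 97 bottles".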

    Regular Expression Groups

    Groups are useful to get subexpressions of regular expressions. Add parentheses around the subexpressions that you want to extract, for example:

    val numitemPattern = "([0-9]+) ([a-z]+)".r

    To match the groups, use the regular expression object as an “extractor” (see Chapter 14), like this:

    val numitemPattern(num, item) = "99 bottles"
    // Sets num to "99", item to "bottles"

    If you want to extract groups from multiple matches, use a for statement like this:

    for (numitemPattern(num, item) <- numitemPattern.findAllIn("99 bottles, 98 bottles"))
        process(num, item)
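
    Keep in mind that a val definition with an extractor pattern throws a MatchError when the string does not match. The sketch below shows two alternatives under that assumption: findAllMatchIn with explicit group access, and a match expression with a fallback case. GroupDemo, parse, and parseOne are illustrative names.

```scala
object GroupDemo {
  val numitemPattern = "([0-9]+) ([a-z]+)".r

  // Returns one (count, item) pair per match found in the text.
  def parse(text: String): List[(Int, String)] =
    numitemPattern.findAllMatchIn(text)
      .map(m => (m.group(1).toInt, m.group(2)))
      .toList

  // Matches the whole string against the pattern; yields None instead of failing.
  def parseOne(text: String): Option[(Int, String)] = text match {
    case numitemPattern(num, item) => Some((num.toInt, item))
    case _ => None
  }
}
```

    Here parse("99 bottles, 98 bottles") gives List((99, "bottles"), (98, "bottles")), while parseOne("no match") gives None.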
  • Original article: https://www.cnblogs.com/chaseblack/p/5878222.html