  • shell of leetcode

    1.Tenth Line

    How would you print just the 10th line of a file?

    For example, assume that file.txt has the following content:
    Line 1
    Line 2
    Line 3
    Line 4
    Line 5
    Line 6
    Line 7
    Line 8
    Line 9
    Line 10

    Your script should output the tenth line, which is:
    Line 10
    -------------------

    # Read from the file file.txt and output the tenth line to stdout.
    
    #Solution One:
    #head -n 10 file.txt | tail -n +10
    
    #Solution Two:
    #awk 'NR==10' file.txt
    
    #Solution Three:
    sed -n 10p file.txt
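
    All three solutions print nothing when the file has fewer than ten lines, which is the behavior the problem expects. As a rough sketch of the same idea for any line number, the function below (nth_line is a hypothetical name, not part of the original solutions) uses sed's q command so that reading stops right after the requested line:

```shell
# nth_line N FILE: print line N of FILE, or nothing if FILE is shorter.
# nth_line is a hypothetical helper name used only for illustration.
nth_line() {
    # "Nq" quits at line N (printing it first); "d" deletes every earlier line.
    sed "${1}q;d" "$2"
}
```

    For example, nth_line 10 file.txt prints the same line as sed -n 10p file.txt, but does not scan the rest of a long file.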
    

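    Concepts involved:

    ->head writes the beginning of a file to standard output. By default it prints the first 10 lines of each file.

    Syntax: head [OPTION]... [FILE]...

    Options:

    -q never print file-name headers

    -v always print file-name headers

    -c<bytes> print the first <bytes> bytes

    -n<lines> print the first <lines> lines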
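
    A small sketch of the -n and -c options listed above, assuming a scratch file created with mktemp (the variable name sample is made up for this example):

```shell
# sample is a hypothetical scratch file holding four one-letter lines.
sample=$(mktemp)
printf 'a\nb\nc\nd\n' > "$sample"

head -n 2 "$sample"    # first two lines: a and b
head -c 3 "$sample"    # first three bytes: "a", a newline, "b"
```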

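    ->tail prints the end of the given file; with no file argument it reads standard input. It is commonly used to view log files.

    Options:

    -f keep reading as the file grows (follow mode)

    -v always print file-name headers

    -c<count> print the last <count> bytes

    -n<lines> print the last <lines> lines; -n +<lines> prints from line <lines> to the end

    --pid=PID with -f, exit after the process with ID PID dies

    -q, --quiet, --silent never print file-name headers

    -s, --sleep-interval=S with -f, sleep S seconds between checks

    See also: "Linux commands I have used: tail - print the end of a file / follow the end of a file"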
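
    Solution One depends on the difference between tail -n N (the last N lines) and tail -n +N (everything from line N onward). A minimal sketch, assuming a five-line scratch file (the variable name log is made up for this example):

```shell
# log is a hypothetical scratch file with the lines 1 through 5.
log=$(mktemp)
printf '1\n2\n3\n4\n5\n' > "$log"

tail -n 2 "$log"     # last two lines: 4 and 5
tail -n +4 "$log"    # from line 4 to the end: also 4 and 5
tail -n +6 "$log"    # starts past the end of the file: prints nothing
```

    The last call shows why head -n 10 file.txt | tail -n +10 prints nothing for a file shorter than ten lines.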

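    ->awk is a powerful text-analysis tool. Where grep searches and sed edits, awk is especially strong at analyzing data and producing reports. In short, awk reads a file line by line, splits each line into fields using whitespace as the default separator, and then processes the fields.

    Syntax:

    awk 'pattern { action }' filenames
    

    pattern is what awk looks for in the data, and action is the series of commands run when a line matches. Either part may be omitted: a missing pattern matches every line, and a missing action prints the matching line, which is why awk 'NR==10' prints line 10.

    See also: "Linux awk command explained in detail"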
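
    To make the pattern and action parts concrete, here is a short sketch on two made-up input lines (the data is borrowed from the Transpose File example further down):

```shell
# Pattern only: the default action prints the whole matching line.
printf 'alice 21\nryan 30\n' | awk '$2 > 25'
# Pattern and action: print just the first field of matching lines.
printf 'alice 21\nryan 30\n' | awk '$2 > 25 { print $1 }'
# Action only: runs for every line; NF is the number of fields on that line.
printf 'alice 21\nryan 30\n' | awk '{ print NF }'
```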

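    ->sed is a stream editor that processes one line at a time. It copies the current line into a temporary buffer called the "pattern space", applies the sed commands to the buffer, and then writes the buffer to standard output. It then moves on to the next line and repeats until the end of the file. With -n the automatic write is suppressed, so only lines printed explicitly (for example with the p command) appear.

    Syntax:

    sed [-hnV][-e<script>][-f<script file>][text file]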
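
    A brief sketch of how -n and p work together, plus a plain substitution for contrast. The three input lines are made up for this example:

```shell
# -n turns off automatic printing, so only the addressed line is shown.
printf 'one\ntwo\nthree\n' | sed -n '2p'      # prints: two
# An address range works the same way.
printf 'one\ntwo\nthree\n' | sed -n '2,3p'    # prints: two, three
# Without -n every line is printed after the edit is applied.
printf 'one\ntwo\nthree\n' | sed 's/t/T/'     # prints: one, Two, Three
```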
    

    2.Transpose File
    Given a text file file.txt, transpose its content.

    You may assume that each row has the same number of columns and each field is separated by the ' ' character.

    For example, if file.txt has the following content:

    name age
    alice 21
    ryan 30
    Output the following:

    name alice ryan
    age 21 30

    ---------

    # Read from the file file.txt and print its transposed content to stdout.
    # using awk for this purpose
    awk '
        {
            for(i=1; i<=NF; i++)
            {   
                if(line[i] == "")
                {
                    line[i] = $i
                }
                else
                {
                    line[i] = line[i]" "$i
                }
            }
        }
        END{
             for(i=1; i<=NF; i++)
             {
                 print line[i]
             }
           }
        ' file.txt
    

    If the number of columns is two, the following approach also works:

    test2

    name age
    alice 21
    ryan 30
    

    solution:

    MindeMacBook-Pro:闲杂笔记 minzhu$ cut -d " " -f1 test2 |xargs
    name alice ryan
    MindeMacBook-Pro:闲杂笔记 minzhu$ cut -d " " -f2 test2 |xargs
    age 21 30
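
    The same column-by-column idea can be stretched to any number of columns by looping over the field numbers. The sketch below is only an illustration: transpose_cut is a hypothetical name, and it swaps xargs for paste -s, which joins lines without interpreting quotes or backslashes in the data.

```shell
# transpose_cut FILE: print each space-separated column of FILE as one row.
# transpose_cut is a hypothetical helper name used only for illustration.
transpose_cut() {
    # Count the fields on the first line; an empty file yields 0 columns.
    ncols=$(head -n 1 "$1" | awk '{ n = NF } END { print n + 0 }')
    i=1
    while [ "$i" -le "$ncols" ]; do
        # Extract column i, then join its lines with single spaces.
        cut -d ' ' -f "$i" "$1" | paste -s -d ' ' -
        i=$((i + 1))
    done
}
```

    This reads the file once per column, so the awk solution above is the better choice for wide files.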
    

      

    3.Valid Phone Numbers

    Given a text file file.txt that contains list of phone numbers (one per line), write a one liner bash script to print all valid phone numbers.

    You may assume that a valid phone number must appear in one of the following two formats: (xxx) xxx-xxxx or xxx-xxx-xxxx. (x means a digit)

    You may also assume each line in the text file must not contain leading or trailing white spaces.

    For example, assume that file.txt has the following content:

    987-123-4567
    123 456 7890
    (123) 456-7890
    

    Your script should output the following valid phone numbers:

    987-123-4567
    (123) 456-7890

    ------------

    file.txt

    987-123-4567
    123 456 7890
    (123) 456-7890
    

    solution1:

    grep -e '^[0-9]\{3\}-[0-9]\{3\}-[0-9]\{4\}$' -e '^([0-9]\{3\}) [0-9]\{3\}-[0-9]\{4\}$' file.txt
    

    explanation:

    1. In a regular expression, \ changes the meaning of the character that follows it
    2. ^ denotes the beginning of a line
    3. $ denotes the end of a line
    4. \{M\} matches the previous item exactly M times (grep uses basic regular expressions by default, where the braces must be escaped)
    5. \(...\) groups a pattern together; an unescaped ( or ) matches a literal parenthesis, which the second pattern relies on

    Back to this problem: it requires us to match two formats. For readability, I used -e twice to split them into two regexes: the first matches xxx-xxx-xxxx and the second matches (xxx) xxx-xxxx.
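
    As an alternative sketch, the two formats can be folded into one extended regular expression with alternation. Under grep -E the escaping rules flip: bare ( ) and { } are operators, while \( and \) match literal parentheses. The function name valid_phones is made up for this example.

```shell
# valid_phones FILE: print lines shaped like xxx-xxx-xxxx or (xxx) xxx-xxxx.
# valid_phones is a hypothetical helper name used only for illustration.
valid_phones() {
    # The group picks one prefix, "ddd-" or "(ddd) "; the rest is shared.
    grep -E '^([0-9]{3}-|\([0-9]{3}\) )[0-9]{3}-[0-9]{4}$' "$1"
}
```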

    solution2:

    awk < file.txt '/^[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]$/ || /^\([0-9][0-9][0-9]\) [0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]$/ {print}'
    

    The format for 'awk':
    awk < file 'pattern {action}'
    or
    awk 'pattern {action}' file

    Note: 'print' action without any arguments means print out the whole line.

    4.Word Frequency

    Write a bash script to calculate the frequency of each word in a text file words.txt.

    For simplicity sake, you may assume:

    • words.txt contains only lowercase characters and space ' ' characters.
    • Each word must consist of lowercase characters only.
    • Words are separated by one or more whitespace characters.

    For example, assume that words.txt has the following content:

    the day is sunny the the
    the sunny is is
    

    Your script should output the following, sorted by descending frequency:

    the 4
    is 3
    sunny 2
    day 1

    -----------------  

    words.txt

    the day is sunny the the
    the sunny is is
    

    solution1:

    awk '{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}' words.txt | sort -k2 -nr
    

    solution2:

    sed -E 's/^[[:space:]]+//; s/[[:space:]]+/ /g; s/[[:space:]]+$//' words.txt | tr ' ' '\n' | sort | uniq -c | sort -nr | awk '{print $2" "$1}'
    
    1. use sed to strip leading and trailing spaces and to collapse each run of inline spaces into one space
    2. use tr to translate each space into a newline (these two steps can also be done with tr -s ' ' '\n' < words.txt)
    3. sort the words
    4. use uniq -c to count each word
    5. sort the counts: -n for numeric sort, -r for reverse order
    6. use awk to format the output
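
    The tr -s shortcut mentioned in step 2 can be sketched as a complete pipeline. The function name word_freq is made up for this example, and the NF == 2 guard is an added assumption: it drops the empty "word" that appears when a file begins with spaces.

```shell
# word_freq FILE: print "word count" pairs, highest count first.
# word_freq is a hypothetical helper name used only for illustration.
word_freq() {
    # tr -s turns every space into a newline and squeezes repeated newlines,
    # leaving one word per line for sort and uniq -c to count.
    tr -s ' ' '\n' < "$1" | sort | uniq -c | sort -nr |
        awk 'NF == 2 { print $2, $1 }'
}
```

    The relative order of words with equal counts is left to sort, so only the ordering by count should be relied on.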

      

    Reference: leetcode

  • Original article: https://www.cnblogs.com/carsonzhu/p/5746396.html