  • R Language Notes (1)

    2016-06-19 21:44:10

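    Common Functions

    object.size() ## size of an object in memory
    names() ## variable names of a data set
    head(x, 10), tail(x, 10) ## first / last 10 rows
    summary() ## summary statistics for each variable of a data set
    table(x$y) ## count how often each value of y occurs
    str() ## compact structure of a data set or function
    nrow(), ncol() ## number of rows / columns
    sqrt(x) ## square root of x
    abs(x) ## absolute value of x
    names(vect2) <- c("foo", "bar", "norf") ## name the elements of a vector
    identical(vect, vect2) ## TRUE if the two vectors are exactly the same
    vect[c("foo", "bar")] ## select elements by name
    colnames(my_data) <- cnames ## set the column names of a data frame
    t() ## transpose: swap rows and columns (a data frame comes back as a matrix)
    length("") ## number of elements in a vector, not characters; a single empty string is a vector of length 1
    nchar("") ## number of characters; 0 for an empty string
    tolower() ## convert to lower case
    toupper() ## convert to upper case
    chartr("A", "B", x) ## replace every "A" in the string x with "B", character by character
    na.omit() ## drop every observation (row) that contains a missing value (listwise deletion)
    paste() ## concatenate strings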

    paste("Var",1:5,sep="")
    [1] "Var1" "Var2" "Var3" "Var4" "Var5"
    
    > x<-list(a='aaa',b='bbb',c="ccc")
    > y<-list(d="163.com",e="qq.com")
    > paste(x,y,sep="@")
    [1] "aaa@163.com" "bbb@qq.com"  "ccc@163.com"
    
    # Add the collapse argument to join the results into a single string with the given separator
    > paste(x,y,sep="@",collapse=';')
    [1] "aaa@163.com;bbb@qq.com;ccc@163.com"
    > paste(x,collapse=';')
    [1] "aaa;bbb;ccc"
    

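    strsplit(): splitting strings

    strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE)
    x is the character vector to split.
    split is a character vector giving the pattern to split on; by default (fixed = FALSE) it is interpreted as a regular expression.
    With fixed = TRUE, split is matched as literal text rather than as a regular expression; literal matching is faster.
    perl = TRUE switches to Perl-compatible regular expressions (PCRE); it selects the regex engine and does not depend on any Perl installation. For long patterns, a correctly written expression with perl = TRUE can be faster.
    useBytes controls whether matching is done byte by byte; the default FALSE matches character by character.
    strsplit returns a list with one element per input string; how to process it further depends on the task.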
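A minimal sketch of working with the list that strsplit() returns, and of what fixed = TRUE changes. The sample strings and variable names are made up for illustration.

```r
# Hypothetical sample data, used only for illustration
emails <- c("aaa@163.com", "bbb@qq.com")

# strsplit() always returns a list, one element per input string
parts <- strsplit(emails, "@", fixed = TRUE)

# Pull the first piece (the user name) out of every list element
users <- sapply(parts, `[`, 1)

# With the default fixed = FALSE, "." is a regex wildcard that matches every
# character, so all pieces come back empty; fixed = TRUE treats it as a literal dot
regex_split   <- strsplit("a.b.c", ".")[[1]]
literal_split <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]

users          # "aaa" "bbb"
literal_split  # "a" "b" "c"
```

When every input splits into the same number of pieces, do.call(rbind, parts) is another common way to turn the list into a matrix.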
    

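    String replacement: sub(), gsub()

    Strictly speaking, R has no function that replaces text inside an existing string.
    R passes arguments by value, not by reference, so sub() and gsub() return a modified copy and leave the original unchanged unless the result is assigned back.
    sub() replaces only the first match in each string; gsub() replaces every match.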
    > text<-c("Hello, Adam","Hi,Adam!","How are you,Ava")
    > sub(pattern="Adam",replacement="word",text)
    [1] "Hello, word"     "Hi,word!"        "How are you,Ava"
    > sub(pattern="Adam|Ava",replacement="word",text)
    [1] "Hello, word"      "Hi,word!"         "How are you,word"
    > gsub(pattern="Adam|Ava",replacement="word",text)
    [1] "Hello, word"      "Hi,word!"         "How are you,word"
    

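    String extraction: substr(), substring()

    substr() and substring() split or extract strings by position; they do not use regular expressions themselves.
    Combined with the regular-expression functions regexpr(), gregexpr(), or regexec(), they make it easy to pull the needed information out of large amounts of text.
    Syntax
    substr(x, start, stop)
    substring(text, first, last = 1000000L)
    In both, the first argument is the character vector to extract from, the second is a vector of start positions, and the third is a vector of end positions.
    substr() returns as many strings as the first argument has elements.
    substring() returns as many strings as the longest of the three arguments, recycling the shorter vectors.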
    > x <- "123456789" 
    > substr(x, c(2,4), c(4,5,8)) 
    [1] "234" 
    > substring(x, c(2,4), c(4,5,8)) 
    [1] "234"     "45"      "2345678"
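    Because x has length 1, substr() returns a single string:
    it uses only the first start/stop pair, start 2 and stop 4.
    In the substring() call the longest argument is c(4,5,8), so under the recycling rule the first argument effectively becomes c(x,x,x)
    and the second becomes c(2,4,2); the extracted ranges are therefore 2-4, 4-5, and 2-8.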
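The notes mention combining substr()/substring() with regexpr(). A small sketch of that combination follows; the log lines and variable names are hypothetical.

```r
# Hypothetical log lines, used only for illustration
logs <- c("user=adam id=1024", "user=ava id=77", "no id here")

# regexpr() returns the start of the first match (-1 if there is none);
# the length of the match is stored in the "match.length" attribute
m     <- regexpr("[0-9]+", logs)
start <- as.integer(m)
stop  <- start + attr(m, "match.length") - 1L

# Extract by position with substring(); keep NA where nothing matched
ids <- ifelse(start > 0, substring(logs, start, stop), NA_character_)

# regmatches() performs the same extraction in one step, dropping non-matches
ids2 <- regmatches(logs, m)

ids   # "1024" "77" NA
ids2  # "1024" "77"
```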
    

    Workspace and Files

    ls() ## list the objects in the workspace
    list.files(), dir() ## list all files in the working directory
    dir.create("testdir") ## create the directory testdir
    file.create("mytest.R") ## create the file mytest.R
    file.exists("mytest.R") ## check whether the file exists
    file.info("mytest.R"), file.info("mytest.R")$mode ## all file metadata, or one specific field
    file.rename("mytest.R", "mytest2.R") ## rename to mytest2.R
    file.remove("mytest.R") ## delete the file
    file.copy("mytest2.R", "mytest3.R") ## copy to mytest3.R
    file.path("mytest3.R") ## build a relative path to a file
    file.path("folder1", "folder2") ## "folder1/folder2"; builds a platform-independent path from its components

    Create a directory in the current working directory called “testdir2” and a subdirectory for it called “testdir3”, all in one command by using dir.create() and file.path().

     dir.create(file.path('testdir2','testdir3'),recursive = TRUE)
    
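     unlink("testdir2", recursive = TRUE)    ## delete the directory and everything in it (without recursive = TRUE, R refuses to delete a directory); the name comes from the Unix command
    setwd('testdir')     ## make testdir the working directory
    > old.dir <- getwd()
    args()  ## show the arguments of a function
    sample(x) ## with no other arguments, returns a random permutation of x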
    > sample(1:6, 4, replace = TRUE)
    [1] 4 5 1 3
    
    > flips <- sample(c(0, 1), 100, replace = TRUE, prob = c(0.3, 0.7)) # prob sets the probabilities of drawing 0 and 1
    
    > flips
      [1] 1 1 1 1 1 1 1 0 1 1 1 0 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
     [47] 1 0 1 1 1 1 1 0 1 0 1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 0 0 0 0 0 1 0 0 1 1 0 1 1 1 1 1 0 0 1 1 1
     [93] 1 1 1 1 1 0 1 1
    

    Sequence of Numbers

    > 1:10
     [1]  1  2  3  4  5  6  7  8  9 10
    
    > pi:10   ## real numbers
    [1] 3.141593 4.141593 5.141593 6.141593 7.141593 8.141593 9.141593
    

    ?`:` ## open the help page for the : operator

    > seq(1,10)
     [1]  1  2  3  4  5  6  7  8  9 10
    
    > seq(0, 10, by=0.5)
     [1]  0.0  0.5  1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5  5.0  5.5  6.0  6.5  7.0  7.5  8.0  8.5
    [19]  9.0  9.5 10.0
    
    > my_seq <- seq(5, 10, length = 30)  ## 30 evenly spaced numbers from 5 to 10, endpoints included
    > 1:length(my_seq)
     [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
    > seq(along.with = my_seq)
     [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
    
    > seq_along(my_seq)  **
     [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
    
    >rep(c(0,1,2),times=10)
     [1] 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
    >rep(c(0,1,2),each=10)
     [1] 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2
    

    Vector

    > paste(1:3,c("X", "Y", "Z"),sep="")
    [1] "1X" "2Y" "3Z"
    

    * Vector recycling!*

    > paste(LETTERS, 1:4, sep = "-")
     [1] "A-1" "B-2" "C-3" "D-4" "E-1" "F-2" "G-3" "H-4" "I-1" "J-2" "K-3" "L-4" "M-1" "N-2" "O-3"
    [16] "P-4" "Q-1" "R-2" "S-3" "T-4" "U-1" "V-2" "W-3" "X-4" "Y-1" "Z-2"
    

    Data Types

    Objects and Attributes

    Objects

    R has five basic or “atomic” classes of objects:

    • character
    • numeric (real numbers)
    • integer
    • complex
    • logical (True/False)

    The most basic object is a vector

    • A vector can only contain objects of the same class
    • BUT: The one exception is a list, which is represented as a vector but can contain objects of different classes (indeed, that’s usually why we use them)

    Empty vectors can be created with the vector() function.

    Numbers

    • Numbers in R are generally treated as numeric objects (i.e. double precision real numbers)
    • If you explicitly want an integer, you need to specify the L suffix
    • Ex: Entering *1* gives you a numeric object; entering *1L* explicitly gives you an integer **
    • There is also a special number *Inf* which represents infinity; e.g. 1 / 0; Inf can be used in ordinary calculations; e.g. 1 / Inf is 0
    • The value *NaN* represents an undefined value (“not a number”); e.g. 0 / 0; *NaN* can also be thought of as a missing value (more on that later)
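The points above can be checked directly at the console; the following lines are a small illustrative sketch using only base R.

```r
class(1)        # "numeric"
class(1L)       # "integer"
1 / 0           # Inf
1 / Inf         # 0
0 / 0           # NaN
is.na(0 / 0)    # TRUE: NaN also counts as a missing value
is.nan(NA)      # FALSE: NA is not NaN
```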

    Attributes

    R objects can have attributes

    • names, dimnames
    • dimensions (e.g. matrices, arrays)
    • class
    • length
    • other user-defined attributes/metadata
      Attributes of an object can be accessed using the attributes() function

    Vectors and Lists

    Creating Vectors

    The c() function can be used to create vectors of objects.

    > x <- c(0.5, 0.6) ## numeric
    > x <- c(TRUE, FALSE) ## logical
    > x <- c(T, F) ## logical
    > x <- c("a", "b", "c") ## character
    > x <- 9:29 ## integer
    > x <- c(1+0i, 2+4i) ## complex
    

    Using the vector() function

    > x <- vector("numeric", length = 10)
    > x
     [1] 0 0 0 0 0 0 0 0 0 0
    

    Mixing Objects

    When different objects are mixed in a vector, coercion occurs so that every element in the vector is of the same class.

    > y <- c(1.7, "a") ## character
    > y <- c(TRUE, 2) ## numeric
    > y <- c("a", TRUE) ## character
    

    Explicit Coercion

    Objects can be explicitly coerced from one class to another using the as.* functions, if available.

    > x <- 0:6
    > class(x)
    [1] "integer"
    > as.numeric(x)
    [1] 0 1 2 3 4 5 6
    > as.logical(x)
    [1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE
    > as.character(x)
    [1] "0" "1" "2" "3" "4" "5" "6"
    

    Nonsensical coercion results in NAs

    > x <- c("a", "b", "c")
    > as.numeric(x)
    [1] NA NA NA
    Warning message:
    NAs introduced by coercion
    > as.logical(x)
    [1] NA NA NA
    > as.complex(x)
    [1] NA NA NA
    Warning message:
    NAs introduced by coercion 
    

    Lists

    Lists are a special type of vector that can contain elements of different classes. Lists are a very important data type in R and you should get to know them well.

    > x <- list(1, "a", TRUE, 1 + 4i)
    > x
    [[1]]
    [1] 1
    [[2]]
    [1] "a"
    [[3]]
    [1] TRUE
    [[4]]
    [1] 1+4i
    

    Matrices

    Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length 2 (nrow, ncol)

    > m <- matrix(nrow = 2, ncol = 3)
    > m
     [,1] [,2] [,3]
    [1,] NA NA NA
    [2,] NA NA NA
    > dim(m)
    [1] 2 3
    > attributes(m) **
    $dim
    [1] 2 3
    

    Matrices (cont’d)

    Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and running down the columns.

    > m <- matrix(1:6, nrow = 2, ncol = 3)
    > m
     [,1] [,2] [,3]
    [1,] 1 3 5
    [2,] 2 4 6
    

    Matrices can also be created directly from vectors by adding a dimension attribute.**

    > m <- 1:10
    > m
    [1] 1 2 3 4 5 6 7 8 9 10
    > dim(m) <- c(2, 5)  **
    > m
     [,1] [,2] [,3] [,4] [,5]
    [1,] 1 3 5 7 9
    [2,] 2 4 6 8 10
    

    cbind-ing and rbind-ing

    Matrices can be created by column-binding or row-binding with cbind() and rbind().

    > x <- 1:3
    > y <- 10:12
    > cbind(x, y)
     x y
    [1,] 1 10
    [2,] 2 11
    [3,] 3 12
    > rbind(x, y)
     [,1] [,2] [,3]
    x 1 2 3
    y 10 11 12
    

    Factors

    Factors are used to represent categorical data. Factors can be unordered or ordered. One can think of a factor as an integer vector where each integer has a label.

    • Factors are treated specially by modelling functions like *lm()* and *glm()*
    • Using factors with labels is *better* than using integers because factors are self-describing; having a variable that has values “Male” and “Female” is better than a variable that has values 1 and 2.

       x <- factor(c("yes", "yes", "no", "yes", "no"))
       x
      [1] yes yes no yes no
      Levels: no yes
       table(x)
      x
      no yes
      2 3
       unclass(x)
      [1] 2 2 1 2 1
      attr(,"levels")
      [1] "no" "yes"
      

    The order of the levels can be set using the levels argument to factor(). This can be important in linear modelling because the first level is used as the baseline level.

    > x <- factor(c("yes", "yes", "no", "yes", "no"),
     levels = c("yes", "no")) **
    > x
    [1] yes yes no yes no
    Levels: yes no
    

    Missing Values

    Missing values are denoted by NA or NaN for undefined mathematical operations.

    • is.na() is used to test objects if they are NA
    • is.nan() is used to test for NaN
    • NA values have a class also, so there are integer NA, character NA, etc
    • A NaN value is also NA but the converse is not true

      > x <- c(1, 2, NA, 10, 3)
      > is.na(x)
      [1] FALSE FALSE TRUE FALSE FALSE
      > is.nan(x)
      [1] FALSE FALSE FALSE FALSE FALSE
      > x <- c(1, 2, NaN, NA, 4)
      > is.na(x)
      [1] FALSE FALSE TRUE TRUE FALSE
      > is.nan(x)
      [1] FALSE FALSE TRUE FALSE FALSE
      

    Data Frames

    Data frames are used to store tabular data

    • They are represented as a special type of list where every element of the list has to have the same length
    • Each element of the list can be thought of as a column and the length of each element of the list is the number of rows
    • Unlike matrices, data frames can store different classes of objects in each column (just like lists); matrices must have every element be the same class
    • Data frames also have a special attribute called *row.names*
    • Data frames are usually created by calling *read.table()* or *read.csv()*
    • Can be converted to a matrix by calling *data.matrix()* *

      > x <- data.frame(foo = 1:4, bar = c(T, T, F, F))
      > x
       foo bar
      1 1 TRUE
      2 2 TRUE
      3 3 FALSE
      4 4 FALSE
      > nrow(x)
      [1] 4
      > ncol(x)
      [1] 2
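The list above mentions data.matrix() without showing it. A brief sketch on the same example data frame:

```r
x <- data.frame(foo = 1:4, bar = c(TRUE, TRUE, FALSE, FALSE))

# data.matrix() converts every column to numbers; logical values become 1 and 0
m <- data.matrix(x)

is.matrix(m)   # TRUE
dim(m)         # 4 2
m[, "bar"]     # 1 1 0 0
```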
      

    Names Attribute

    Names

    R objects can also have names, which is very useful for writing readable code and self-describing objects.

    > x <- 1:3
    > names(x)
    NULL
    > names(x) <- c("foo", "bar", "norf")
    > x
    foo bar norf
     1 2 3
    > names(x)
    [1] "foo" "bar" "norf"
    

    Lists can also have names.

    > x <- list(a = 1, b = 2, c = 3)
    > x
    $a
    [1] 1
    $b
    [1] 2
    $c
    [1] 3
    

    And matrices.

    > m <- matrix(1:4, nrow = 2, ncol = 2)
    > dimnames(m) <- list(c("a", "b"), c("c", "d")) ***
    > m
     c d
    a 1 3
    b 2 4
    

    Summary

    Data Types

    • atomic classes: numeric, logical, character, integer, complex
    • vectors, lists
    • factors
    • missing values
    • data frames
    • names

    Reading Writing Data

    Reading Data

    There are a few principal functions reading data into R.

    • *read.table()*, *read.csv()*, for reading tabular data
    • *readLines()*, for reading lines of a text file
    • *source()*, for reading in R code files (inverse of dump)**
    • *dget()*, for reading in R code files (inverse of dput)**
    • *load()*, for reading in saved workspaces
    • *unserialize()*, for reading single R objects in binary form

    Writing Data

    There are analogous functions for writing data to files.

    • write.table()
    • writeLines()
    • dump()
    • dput()
    • save()
    • serialize()
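As a rough illustration of two of the read/write pairs listed above, the sketch below round-trips objects through save()/load() and serialize()/unserialize(). The object names and the temporary file are placeholders.

```r
x <- 1:5
y <- list(a = "foo", b = TRUE)

# save() writes one or more named objects to a binary workspace file
f <- tempfile(fileext = ".RData")
save(x, y, file = f)
rm(x, y)

# load() restores the objects under their original names
# and returns those names as a character vector
loaded <- load(f)
unlink(f)

# serialize() with connection = NULL returns a single object as a raw vector;
# unserialize() reverses it
raw_y <- serialize(y, NULL)
y2    <- unserialize(raw_y)
```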

    Reading Data Files with read.table *

    The read.table function is one of the most commonly used functions for reading data. It has a few important arguments:

    • *file*, the name of a file, or a connection
    • *header*, logical indicating if the file has a header line
    • *sep*, a string indicating how the columns are separated
    • *colClasses*, a character vector indicating the class of each column in the dataset
    • *nrows*, the number of rows in the dataset
    • *comment.char*, a character string indicating the comment character
    • *skip*, the number of lines to skip from the beginning
    • *stringsAsFactors*, should character variables be coded as factors? (The default was TRUE before R 4.0.0 and is FALSE since.)

    read.table
    For small to moderately sized datasets, you can usually call read.table without specifying any other arguments.

    data <- read.table("foo.txt")

    R will automatically

    • skip lines that begin with a #
    • figure out how many rows there are (and how much memory needs to be allocated)
    • figure out what type of variable is in each column of the table

    Telling R all these things directly makes R run faster and more efficiently.

    • *read.csv* behaves like *read.table* but with defaults suited to CSV files: the separator is a comma and *header* is TRUE.

    Reading in Larger Datasets with read.table

    With much larger datasets, doing the following things will make your life easier and will prevent R from choking.

    • Read the help page for read.table, which contains many hints
    • Make a rough calculation of the memory required to store your dataset. If the dataset is larger than the amount of RAM on your computer, you can probably stop right here.
    • Set comment.char = "" if there are no commented lines in your file. **
    • Use the *colClasses* argument. Specifying this option instead of using the default can make 'read.table' run MUCH faster, often twice as fast. To use this option, you have to know the class of each column in your data frame. If all of the columns are "numeric", for example, then you can just set *colClasses = "numeric"*. A quick and dirty way to figure out the classes of each column is the following:
    initial <- read.table("datatable.txt", nrows = 100) ***
    classes <- sapply(initial, class)
    tabAll <- read.table("datatable.txt",
                          colClasses = classes)
    • Set *nrows*. This doesn’t make R run faster but it helps with memory usage. A mild overestimate is okay. You can use the Unix tool *wc* to calculate the number of lines in a file.
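The hints above can be combined in one call. The sketch below is self-contained: it writes a small made-up table to a temporary file to stand in for a large one, so the file, column names, and row count are assumptions for illustration only.

```r
# Write a small hypothetical table to a temporary file to stand in for a large one
path <- tempfile(fileext = ".txt")
write.table(data.frame(id = 1:1000, value = (1:1000) / 10, label = "a"),
            path, row.names = FALSE)

# Learn the column classes from the first 100 rows only
initial <- read.table(path, header = TRUE, nrows = 100)
classes <- sapply(initial, class)

# Read the full file with classes, row count, and comment handling stated up front
tab_all <- read.table(path, header = TRUE, colClasses = classes,
                      nrows = 1000, comment.char = "")
unlink(path)

dim(tab_all)   # 1000 3
```

One caveat with this approach: classes inferred from the first rows can be wrong for later rows (for example, a column that looks like integers at first and contains decimals further down), so it is worth checking a sample from elsewhere in the file.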

    Know Thy System

    In general, when using R with larger datasets, it’s useful to know a few things about your system.

    • How much memory is available?
    • What other applications are in use?
    • Are there other users logged into the same system?
    • What operating system?
    • Is the OS 32 or 64 bit?

    Calculating Memory Requirements

    I have a data frame with 1,500,000 rows and 120 columns, all of which are numeric data. Roughly, how much memory is required to store this data frame?
    1,500,000 × 120 × 8 bytes/numeric

    = 1,440,000,000 bytes
    = 1,440,000,000 / 2^20 bytes/MB
    = 1,373.29 MB
    = 1.34 GB
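The same arithmetic can be done in R, with a cross-check against object.size() on a smaller slice. This is an illustrative sketch; the variable names are arbitrary.

```r
rows  <- 1500000
cols  <- 120
bytes <- rows * cols * 8        # 8 bytes per numeric (double) value
mb    <- bytes / 2^20           # bytes per MB
gb    <- mb / 1024
round(c(bytes = bytes, MB = mb, GB = gb), 2)

# Cross-check on a small slice: a 1,000 x 120 numeric matrix takes
# slightly more than 1000 * 120 * 8 bytes because of the object header
small <- matrix(0, nrow = 1000, ncol = cols)
object.size(small)
```

The estimate covers the raw data only; reading the file typically needs noticeably more memory than the finished data frame occupies.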

    Textual Formats

    • *dumping* and *dputting* are useful because the resulting textual format is editable and, in the case of corruption, potentially recoverable.
    • *Unlike* writing out a table or csv file, *dump* and *dput* preserve the *metadata* (sacrificing some readability), so that another user doesn’t have to specify it all over again.
    • *Textual* formats can work much better with version control programs like subversion or git which can only track changes meaningfully in text files
    • Textual formats can be longer-lived; if there is corruption somewhere in the file, it can be easier to fix the problem
    • Textual formats adhere to the “Unix philosophy”
    • Downside: The format is not very space-efficient

    dput-ting R Objects ?

    Another way to pass data around is by deparsing the R object with dput and reading it back in using dget.

    > y <- data.frame(a = 1, b = "a")
    > dput(y)
    structure(list(a = 1,
                     b = structure(1L, .Label = "a",
                                            class = "factor")),
                .Names = c("a", "b"), row.names = c(NA, -1L),
                class = "data.frame")
    > dput(y, file = "y.R")
    > new.y <- dget("y.R")
    > new.y
         a    b
    1   1    a
    

    Dumping R Objects ?

    Multiple objects can be deparsed using the dump function and read back in using source.

    > x <- "foo"
    > y <- data.frame(a = 1, b = "a")
    > dump(c("x", "y"), file = "data.R")
    > rm(x, y)
    > source("data.R")
    > y
        a  b
    1  1  a
    > x
    [1] "foo"
    

    Interfaces to the Outside World

    Data are read in using connection interfaces. Connections can be made to files (most common) or to other more exotic things.

    • *file*, opens a connection to a file
    • *gzfile*, opens a connection to a file compressed with gzip
    • *bzfile*, opens a connection to a file compressed with bzip2
    • *url*, opens a connection to a webpage

    File Connections **

    > str(file)
    function (description = "", open = "", blocking = TRUE,
                encoding = getOption("encoding"))
    
     1. *description* is the name of the file
     2. *open* is a code indicating
        - “r” read only
        - “w” writing (and initializing a new file)
        - “a” appending
        - “rb”, “wb”, “ab” reading, writing, or appending in binary mode (Windows)
    

    Connections

    In general, connections are powerful tools that let you navigate files or other external objects. In practice, we often don’t need to deal with the connection interface directly.

    con <- file("foo.txt", "r") **
    data <- read.csv(con)
    close(con)

    is the same as

    data <- read.csv("foo.txt")
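To make the open modes listed earlier concrete, here is a small sketch that writes, appends to, and reads a temporary text file through explicit connections. The file and its contents are placeholders.

```r
path <- tempfile(fileext = ".txt")

con <- file(path, "w")                 # "w": open for writing, creating the file
writeLines(c("first line", "second line", "third line"), con)
close(con)

con <- file(path, "a")                 # "a": append to the existing file
writeLines("fourth line", con)
close(con)

con <- file(path, "r")                 # "r": read only
first_two <- readLines(con, n = 2)     # read just the first two lines
close(con)

all_lines <- readLines(path)           # given a path, readLines opens and closes the connection itself
unlink(path)

first_two   # "first line" "second line"
```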

    Reading Lines of a Text File

    > con <- gzfile("words.gz")
    > x <- readLines(con, 10)
    > x
     [1] "1080"        "10-point"   "10th"         "11-point"
     [5] "12-point"  "16-point"   "18-point"  "1st"
     [9] "2"              "20-point"
    

    writeLines takes a character vector and writes each element one line at a time to a text file.
    readLines can be useful for reading in lines of webpages

    ## This might take time
    con <- url("http://www.jhsph.edu", "r")
    x <- readLines(con)
    > head(x)
    [1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\">"
    [2] ""
    [3] "<html>"
    [4] "<head>"
    [5] "\t<meta http-equiv=\"Content-Type\" content=\"text/html;charset=utf-8

    Subsetting

    There are a number of operators that can be used to extract subsets of R objects.

    • [ always returns an object of the same class as the original; can be used to select more than one element (there is one exception)
    • [[ is used to extract elements of a list or a data frame; it can only be used to extract a single element and the class of the returned object will not necessarily be a list or data frame
    • $ is used to extract elements of a list or data frame by name; semantics are similar to that of [[.

      > x <- c("a", "b", "c", "c", "d", "a")
      > x[1]
      [1] "a"
      > x[2]
      [1] "b"
      > x[1:4]
      [1] "a" "b" "c" "c"
      > x[x > "a"]
      [1] "b" "c" "c" "d"
      > u <- x > "a"
      > u
      [1] FALSE TRUE TRUE TRUE TRUE FALSE
      > x[u]
      [1] "b" "c" "c" "d"

    Subsetting Lists

    > x <- list(foo = 1:4, bar = 0.6)
    > x[1]
    $foo
    [1] 1 2 3 4
    > x[[1]]
    [1] 1 2 3 4
    > x$bar
    [1] 0.6
    > x[["bar"]]
    [1] 0.6
    > x["bar"]
    $bar
    [1] 0.6
    
    > x <- list(foo = 1:4, bar = 0.6, baz = "hello")
    > x[c(1, 3)]
    $foo
    [1] 1 2 3 4
    $baz
    [1] "hello"
    

    The [[ operator can be used with computed indices; $ can only be used with literal names.

    > x <- list(foo = 1:4, bar = 0.6, baz = "hello")
    > name <- "foo"
    > x[[name]]     ## computed index for ‘foo’ **
    [1] 1 2 3 4
    > x$name       ## element ‘name’ doesn’t exist!
    NULL
    > x$foo
    [1] 1 2 3 4       ## element ‘foo’ does exist
    

    Subsetting Nested Elements of a List
    The [[ can take an integer sequence.

    > x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))
    > x[[c(1, 3)]]    **
    [1] 14
    > x[[1]][[3]]
    [1] 14
    > x[[c(2, 1)]]
    [1] 3.14
    

    Subsetting a Matrix

    Matrices can be subsetted in the usual way with (i,j) type indices.

    > x <- matrix(1:6, 2, 3)
    > x[1, 2]
    [1] 3
    > x[2, 1]
    [1] 2
    

    Indices can also be missing. **

    > x[1, ]
    [1] 1 3 5
    > x[, 2]
    [1] 3 4
    

    By default, when a single element of a matrix is retrieved, it is returned as a vector of length 1 rather than a 1 × 1 matrix. This behavior can be turned off by setting drop = FALSE.

    > x <- matrix(1:6, 2, 3)
    > x[1, 2]
    [1] 3
    > x[1, 2, drop = FALSE] **
        [,1] 
    [1,]   3
    

    Similarly, subsetting a single column or a single row will give you a vector, not a matrix (by default).

    > x <- matrix(1:6, 2, 3)
    > x[1, ]
    [1] 1 3 5
    > x[1, , drop = FALSE]
      [,1]    [,2]    [,3]
    [1,]   1       3       5
    

    Partial Matching

    Partial matching of names is allowed with [[ and $

    > x <- list(aardvark = 1:5)
    > x$a
    [1] 1 2 3 4 5
    > x[["a"]]
    NULL
    > x[["a", exact = FALSE]] ***
    [1] 1 2 3 4 5 
    

    Removing NA Values *

    A common task is to remove missing values (NAs).

    > x <- c(1, 2, NA, 4, NA, 5)
    > bad <- is.na(x)
    > x[!bad]
    [1] 1 2 4 5
    

    What if there are multiple things and you want to take the subset with no missing values?

    > x <- c(1, 2, NA, 4, NA, 5)
    > y <- c("a", "b", NA, "d", NA, "f")
    > good <- complete.cases(x, y) ***
    > good
    [1] TRUE TRUE FALSE TRUE FALSE TRUE
    > x[good]
    [1] 1 2 4 5
    > y[good]
    [1] "a" "b" "d" "f"
    
    > airquality[1:6, ]
          Ozone     Solar.R    Wind       Temp     Month   Day
    1       41       190       7.4         67       5       1
    2       36       118       8.0         72       5       2
    3       12       149       12.6       74       5       3
    4       18       313       11.5       62       5       4
    5       NA       NA       14.3       56       5       5
    6       28       NA       14.9       66       5       6
    > good <- complete.cases(airquality)
    > airquality[good, ] [1:6, ]   ***
             Ozone Solar.R   Wind      Temp       Month     Day
    1       41       190       7.4         67       5       1
    2       36       118       8.0         72       5       2
    3       12       149       12.6       74       5       3
    4       18       313       11.5       62       5       4
    7       23       299       8.6         65       5       7
    

    Vectorized Operations

    Many operations in R are vectorized making code more efficient, concise, and easier to read.

    > x <- 1:4; y <- 6:9
    > x + y
    [1] 7 9 11 13
    > x > 2
    [1] FALSE FALSE TRUE TRUE
    > x >= 2
    [1] FALSE TRUE TRUE TRUE
    > y == 8
    [1] FALSE FALSE TRUE FALSE
    > x * y
    [1] 6 14 24 36
    > x / y
    [1] 0.1666667 0.2857143 0.3750000 0.4444444
    

    Vectorized Matrix Operations

    > x <- matrix(1:4, 2, 2); y <- matrix(rep(10, 4), 2, 2) ?
    > x * y             ## element-wise multiplication
            [,1]    [,2]
    [1,]    10    30
    [2,]    20    40
    > x / y
         [,1]    [,2]
    [1,]    0.1    0.3
    [2,]    0.2    0.4
    > x %*% y       ## true matrix multiplication
              [,1]    [,2]
    [1,]      40    40
    [2,]      60    60
    

    Missing Value

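    is.na(mydata) and mydata == NA do not give the same result: any comparison with NA yields NA, so mydata == NA is a vector of NAs. Use is.na() to test for missing values.

    R uses 'one-based indexing', which (you guessed it!) means the first element of a vector is considered element 1.

    x[c(2, 10)] ## the 2nd and 10th elements of x
    x[c(-2, -10)] ## every element except the 2nd and 10th
    x[-c(2, 10)] ## same as above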
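A short sketch of testing for NA and of negative indexing; the vector is made up for illustration.

```r
x <- c(1, 2, NA, 10, 3)

eq_na  <- x == NA        # NA NA NA NA NA: a comparison with NA is never TRUE or FALSE
is_na  <- is.na(x)       # FALSE FALSE TRUE FALSE FALSE
kept   <- x[!is_na]      # 1 2 10 3

# Negative indices drop elements; both spellings are equivalent
drop_a <- x[c(-2, -4)]   # 1 NA 3
drop_b <- x[-c(2, 4)]    # 1 NA 3
```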

  • A Primer in BERTology: What we know about how BERT works
    Author: Anna Rogers
    Published: 2020/2/7
    Paper link: https://arxiv.org/pdf/2002.12327.pdf
    

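    Abstract

    Transformer-based models are now widely used in NLP, but we still do not understand much about their inner workings. This paper describes what is known to date about the famous BERT model (Devlin et al., 2019), synthesizing over 40 analysis studies. We also provide an overview of the proposed modifications to the model and its training regime. We then outline directions for further research.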

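    1. Introduction

    Since their introduction in 2017, Transformers (Vaswani et al., 2017) have taken NLP by storm, offering enhanced parallelization and better modeling of long-range dependencies. The best-known Transformer-based model is BERT (Devlin et al., 2019), which obtained state-of-the-art results on numerous benchmarks and has been integrated into Google Search, where it is estimated to improve 10% of queries.

    While it is clear that BERT and other Transformer-based models work remarkably well, it is less clear why, which limits further hypothesis-driven improvement of the architecture. Unlike CNNs, Transformers have little cognitive motivation, and the size of these models limits our ability to experiment with pre-training and to perform ablation studies. This explains the large number of studies over the past year that tried to understand the reasons behind BERT's performance.
    This paper gives an overview of what has been learned so far and highlights the questions that remain unresolved. We focus on the types of knowledge BERT learns, how it learns them, and the methods proposed to improve it.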

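    2. Overview of BERT architecture

    Fundamentally, BERT is a stack of Transformer encoder layers (Vaswani et al., 2017) that consist of multiple "heads", i.e. fully connected neural networks augmented with a self-attention mechanism. For every input token in a sequence, each head computes key, value, and query vectors, which are used to create a weighted representation. The outputs of all heads in the same layer are combined and run through a fully connected layer. Each layer is wrapped with a skip connection, and layer normalization is applied after it.
    The conventional workflow for BERT consists of two stages: pre-training and fine-tuning. Pre-training uses two semi-supervised tasks: masked language modeling (MLM, prediction of randomly masked input tokens) and next sentence prediction (NSP, predicting whether two input sentences are adjacent to each other). In fine-tuning for downstream applications, one or more fully connected layers are typically added on top of the final encoder layer.
    The input representations are computed as follows:
    BERT first tokenizes the given word into wordpieces (Wu et al., 2016b), and then combines three embedding layers (token, position, and segment) to obtain a fixed-length vector. The special token [CLS] is used for classification predictions, and [SEP] separates input segments. The original BERT comes in two versions, base and large, which differ in the number of layers, the hidden size, and the number of attention heads.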

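    3. BERT embeddings

    Unlike conventional static embeddings (Mikolov et al., 2013a; Pennington et al., 2014), BERT's representations are contextualized, i.e. every input token is represented by a vector that depends on the particular context in which it occurs. In current studies of BERT's representation space, the term "embedding" refers to the output vector of a given (typically final) Transformer layer.
    Wiedemann et al. (2019) find that BERT's contextualized embeddings form distinct and clear clusters corresponding to word senses, which confirms that the basic distributional hypothesis holds for these representations. However, Mickus et al. (2019) note that the representations of the same word vary depending on the position of the sentence in which it occurs, likely because of the NSP objective.
    Ethayarajh (2019) measures how similar the embeddings of identical words are in every layer and finds that later BERT layers produce more context-specific representations. They also find that BERT embeddings occupy a narrow cone in the vector space, and that this effect increases from lower to higher layers. That is, two random words will on average have a much higher cosine similarity than would be expected if the embeddings were directionally uniform (isotropic).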

    4. What knowledge does BERT have?

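    A number of studies have looked at the types of knowledge encoded in BERT's weights. Popular approaches include fill-in-the-gap probes of BERT's MLM, analysis of self-attention weights, and probing classifiers that take different BERT representations as input.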

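    4.1 Syntactic knowledge

    Lin et al. (2019) showed that BERT representations are hierarchical rather than linear, i.e. there is something akin to syntactic tree structure in addition to the word order information. Tenney et al. (2019b) and Liu et al. (2019a) also showed that BERT embeddings encode information about parts of speech, syntactic chunks, and roles. However, BERT's knowledge of syntax is partial, since probing classifiers could not recover the labels of distant parent nodes in the syntactic tree (Liu et al., 2019a).

    As for how syntactic information is represented, it seems that syntactic structure is not directly encoded in self-attention weights, but they can be transformed to reflect it. Htut et al. (2019) were unable to extract full parse trees from BERT heads, even with the gold annotations for the root. Jawahar et al. (2019) include a brief illustration of a dependency tree extracted directly from self-attention weights, but provide no quantitative evaluation. However, Hewitt and Manning (2019) were able to learn transformation matrices that successfully recover much of the Stanford Dependencies formalism for PennTreebank data (see Figure 2). Jawahar et al. (2019) try to approximate BERT representations with Tensor Product Decomposition Networks (McCoy et al., 2019a), concluding that dependency trees are the best match among 5 decomposition schemes (although the reported MSE differences are very small).
    [Figure 2]
    Regarding the syntactic competence of BERT's MLM, Goldberg (2019) showed that BERT takes subject-predicate agreement into account when performing the cloze task. This held even for sentences with distractor words between the subject and the verb, and for meaningless sentences. A study of negative polarity items (NPIs) by Warstadt et al. showed that BERT is better able to detect the presence of NPIs (e.g. "ever") and the words that allow their use (e.g. "whether") than scope violations.

    The above evidence of syntactic knowledge is undercut by the fact that BERT does not "understand" negation and is insensitive to malformed input. In particular, its predictions do not change even when word order is shuffled, sentences are truncated, or subjects and objects are removed (Ettinger, 2019). This is in line with recent findings on adversarial attacks, where models are disturbed by nonsensical inputs (Wallace et al., 2019a), and it suggests that BERT's encoding of syntactic structure does not mean that it actually relies on that knowledge.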

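    4.2 Semantic knowledge

    To date, more studies have been devoted to BERT's knowledge of syntactic rather than semantic phenomena. However, we do have evidence from an MLM probing study that BERT has some semantic knowledge (Ettinger, 2019). BERT is even able to prefer incorrect fillers for semantic roles that are semantically related to the correct ones over those that are unrelated (e.g. "to tip a chef" should be better than "to tip a robin", but worse than "to tip a waiter").

    Tenney et al. (2019b) showed that BERT encodes information about entity types, relations, semantic roles, and proto-roles, since this information can be detected with probing classifiers.

    BERT struggles with representations of numbers. Addition and number decoding tasks showed that BERT does not form good representations of floating point numbers and fails to generalize away from the training data (Wallace et al., 2019b). A further problem is BERT's wordpiece tokenization, since numbers of similar value can be split into substantially different word chunks.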

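    4.3 World knowledge

    The MLM component of BERT is easy to adapt for knowledge induction by filling in the blanks (e.g. "Cats like to chase [ _ _ _ ]"). There is at least one probing study of world knowledge in BERT, but the bulk of the evidence comes from practitioners using BERT to extract such knowledge.
    [Figure 3]
    Petroni et al. (2019) showed that, for some relation types, vanilla BERT is competitive with methods relying on knowledge bases (Figure 3). Davison et al. (2019) suggest that it generalizes better to unseen data. However, to retrieve BERT's knowledge we need good template sentences, and there is work on their automatic extraction and augmentation.
    However, BERT cannot reason based on its world knowledge. Forbes et al. (2019) show that BERT can "guess" the affordances and properties of many objects, but has no information about their interactions (e.g. it "knows" that people can walk into houses, and that houses are big, but it cannot infer that houses are bigger than people). Zhou (2020) and Richardson and Sabharwal (2019) also show that performance drops with the number of necessary inference steps. At the same time, Poerner et al. (2019) show that some of BERT's success in factoid knowledge retrieval comes from learning stereotypical associations, e.g. it would predict that a person with an Italian-sounding name is Italian, even when this is factually incorrect.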

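    5. Localizing linguistic knowledge

    5.1 Self-attention heads

    Attention is widely considered useful for understanding Transformer models, and several studies have proposed classifications of attention head types:
    • attending to the word itself, to previous/next words and to the end of the sentence (Raganato and Tiedemann, 2018);
    • attending to previous/next tokens, [CLS], [SEP], punctuation, and "attending broadly" over the sequence (Clark et al., 2019);
    • the 5 attention types shown in Figure 4 (Kovaleva et al., 2019).
    [Figure 4]
    According to Clark et al. (2019), "attention weight has a clear meaning: how much a particular word will be weighted when computing the next representation for the current word". However, Kovaleva et al. (2019) showed that most self-attention heads do not directly encode any non-trivial linguistic information, since fewer than half of them had the "heterogeneous" pattern. Much of the model encoded the vertical pattern (attention to [CLS], [SEP], and punctuation tokens), consistent with the observations of Clark et al. This apparent redundancy must be related to the overparametrization issue (see Section 7).
    Attention to [CLS] is easy to interpret as attention to an aggregated sentence-level representation, but BERT also attends a lot to [SEP] and punctuation. Clark et al. hypothesize that periods and commas are simply almost as frequent as [CLS] and [SEP], and that the model learns to rely on them. They also suggest that the function of [SEP] might be one of "no-op", a signal to ignore the head if its pattern is not applicable to the current case. [SEP] gets increased attention starting in layer 5, but its importance for prediction drops. If this hypothesis is correct, attention probing studies that excluded the [SEP] and [CLS] tokens (e.g. Lin et al. (2019) and Htut et al. (2019)) should perhaps be revisited.

    Some BERT heads seem to specialize in certain types of syntactic relations. Htut et al. (2019) and Clark et al. report that there are BERT heads that attend significantly more than a random baseline to words in certain syntactic positions. The datasets and methods used in these studies differ, but both find heads that attend to words in the obj role more than the positional baseline. The evidence for nsubj, advmod, and amod varies somewhat between the two studies. The overall conclusion is also supported by the data of Voita et al. for the base Transformer in a machine translation context. Hoover (2019) hypothesizes that even complex dependencies like dobj are encoded by a combination of heads rather than a single head, but this work is limited to qualitative analysis.

    Both Clark et al. and Htut et al. conclude that no single head has the complete syntactic tree information, in line with the evidence of partial knowledge of syntax (see subsection 4.1).

    Lin et al. present evidence that attention weights are weak indicators of subject-verb agreement and reflexive anaphora. Instead of serving as strong pointers between tokens that should be related, BERT's self-attention weights were close to a uniform attention baseline, although there was some sensitivity to different types of distractors, consistent with psycholinguistic data.

    Clark et al. (2019) identified a BERT head that can be used directly as a classifier to perform coreference resolution on par with a rule-based system.

    Even when attention heads specialize in tracking semantic relations, they do not necessarily contribute to BERT's performance on relevant tasks. Kovaleva et al. (2019) identified two heads of BERT in which the self-attention maps were closely aligned with annotations of core frame semantic relations (Baker et al., 1998). Although such relations should have been instrumental to tasks such as inference, a head ablation study showed that these heads were not essential for BERT's success on GLUE tasks.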

  • 本文目的:读《What every programmer should know about memory》一文,结合之前的经验理解,进行一下小结 参考网址:http://lwn.net/Articles/255364/ 前言 首先该文是针对x86架构来讲的,因此很多地方仅参考...

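    Purpose: a short summary of "What every programmer should know about memory", read in light of my earlier experience.
    Reference: http://lwn.net/Articles/255364/

    1. Preface
      The article targets the x86 architecture, so many of its points are for reference only.
    2. Main themes of memory-access optimization
      ① Improve locality (temporal and spatial) ② Align code and data
      Main techniques:
      a. Access data sequentially (e.g., for matrix multiplication, transpose the matrix being multiplied first) √
      b. Process all the data a cache line brings in while it is loaded (e.g., split matrix multiplication into 8x8 tiles) √
      c. Mind struct packing; try to fit a struct within a single cache line
      Other points about structs: put frequently used members at the start, and access members in order where possible
      Two ways to align a struct: memalign and __attribute__((aligned(64))); the former applies to heap memory, the latter to objects in .data, .bss, or on the stack √
      d. Avoid placing memory regions so that they all map to the same set of a set-associative cache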
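
    Technique (a) above can be sketched as follows. This is only an illustration of the access pattern, assuming the right-hand matrix b is the one that gets transposed; the function names are hypothetical. In CPython the cache benefit itself will not be visible, since lists of lists are not contiguous arrays; the speed-up described in the article applies to row-major arrays in a language such as C.

```python
def matmul_naive(a, b):
    """Textbook triple loop: the inner loop walks down a column of b,
    which in a row-major array means jumping a full row per step."""
    n, m, p = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
        for i in range(n)
    ]


def matmul_transposed(a, b):
    """Transpose b once, then every inner loop walks two rows
    sequentially (a row of a and a row of the transposed b)."""
    bt = [list(col) for col in zip(*b)]  # bt[j][k] == b[k][j]
    return [
        [sum(x * y for x, y in zip(row_a, row_bt)) for row_bt in bt]
        for row_a in a
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_naive(a, b))       # [[19, 22], [43, 50]]
print(matmul_transposed(a, b))  # [[19, 22], [43, 50]]
```

    The one-off cost of the transpose is repaid when the matrices are large enough that column-wise access would otherwise touch a new cache line on almost every step.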
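    3. L1 instruction cache optimization
      a. Keep total code size as small as possible, balanced against loop unrolling and inlining
      The -Os option works well when the compiler cannot unroll and inline effectively
      For a function called many times from call sites that are close together, inlining is unnecessary: it increases code size and so causes cache misses, and branch prediction (for jumps) already handles code it has seen before well. A function that is called only once can be inlined (the always_inline attribute can be used)
      b. Keep code execution as linear as possible, without waiting on resources
      c. Code alignment
      What to align: the start of functions, jump targets (and the start of loops, though this may cost too many nops and jumps)
      Syntax: in assembly you can use .align
      d. If the condition in if() { statement } is usually false, move the statement body into a separate function and do not inline it
      Two macros can be used to tell the compiler how likely the condition in if() is, together with the -freorder-blocks option
      #define unlikely(expr) __builtin_expect(!!(expr), 0)
      #define likely(expr) __builtin_expect(!!(expr), 1)
    4. Optimizing L2 and larger caches
      Try to keep the whole program in cache
    5. TLB optimization
      Reduce the number of pages used and the number of page-table levels
    6. Prefetching
      There is hardware prefetching and software prefetching; hardware prefetching cannot cross page boundaries
    7. Summary
      I have been forgetful lately. This is all material I had read before, so treat this as a refresher.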
  • Programming Language Syntax Roundup. Programming Tutorial. We love to criticize programming languages. We also love to quote Bjarne Stroustrup: ...

    Programming Language Syntax Roundup

    Programming Tutorial

    We love to criticize programming languages.

    We also love to quote Bjarne Stroustrup:

    “There are only two kinds of languages: the ones people complain about and the ones nobody uses.”

    So today I decided to flip things around and talk about pieces of syntax that I appreciate in some of the various programming languages I have used.

    This is by no means an objective compilation and is meant to be a fun quick read.

    I must also note that I’m far from proficient in most of these languages. Focusing on that many languages would probably be very counter-productive.

    Nevertheless, I’ve at least dabbled with all of them. And so, here’s my list:


    List Comprehension

    def squares(limit):
        return [num*num for num in range(0, limit)]

    Python syntax has a lot of gems one could pick from, but list comprehension is just something from heaven.

    It’s fast, it’s concise, and it’s actually quite readable. Plus it lets you solve Leetcode problems with one-liners. Absolute beauty.
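
    For readers new to the syntax, a small sketch may help: the first function below is the explicit loop that the comprehension replaces, and the second adds a filter clause. The names squares_loop and even_squares are made up for this illustration and are not from the original article.

```python
def squares_loop(limit):
    """The explicit loop that a list comprehension replaces."""
    result = []
    for num in range(limit):
        result.append(num * num)
    return result


def even_squares(limit):
    """A comprehension with a filter: squares of even numbers only."""
    return [num * num for num in range(limit) if num % 2 == 0]


print(squares_loop(5))  # [0, 1, 4, 9, 16]
print(even_squares(7))  # [0, 4, 16, 36]
```

    The comprehension keeps the "what" (num * num), the "from where" (range(limit)), and the optional "which ones" (the if clause) on one line, which is where most of its readability comes from.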


    Spread Operator

    let nums1 = [1,2,3]
    let nums2 = [4,5,6]
    let nums = [...nums1, ...nums2]

    Introduced with ES6, the JavaScript spread operator is just so versatile and clean that it had to be on this list.

    Want to concatenate arrays? Check.

    let nums = [...nums1, ...nums2]

    Want to copy/unpack an array? Check.

    let nums = [...nums1]

    Want to append multiple items? Check.

    nums = [...nums, 6, 7, 8, 9, 10]

    And there are many other uses for it that I won’t mention here.

    In short, it’s neat and useful, so that earns my JS syntax prize.


    Goroutines

    go doSomething()

    Goroutines are lightweight threads in Go. And to create one, all you need to do is add go in front of a function call.

    I feel like concurrency has never been so simple.

    Here’s a quick example for those not familiar with it. The following snippet:

    fmt.Print("Hello")
    go func() {
        doSomethingSlow()
        fmt.Print("world!")
    }()
    fmt.Print(" there ")

    Prints:

    Hello there world!

    (This assumes the program keeps running until the goroutine finishes; a Go program exits when main returns, without waiting for other goroutines.)

    By adding go in front of the call to the closure (anonymous function) we make sure that it is non-blocking.

    Very cool stuff indeed!


    Case & Underscore Indifference

    proc my_func(s: string) =
      echo s

    myFunc("hello")

    Nim is, according to their website, a statically typed compiled systems programming language. And, according to me, you probably never heard of it.

    If you haven’t heard of Nim, I encourage you to check it out, because it’s actually a really cool language. In fact, some people even claim it could work well as a Python substitute.

    Either way, while the example above doesn’t show it too much, Nim’s syntax is often very similar to Python’s.

    As such, this example is not actually what I think is necessarily the best piece of syntax in Nim, since I would probably pick something inherited from Python, but rather something that I find quite interesting.

    I have very little experience with Nim, but one of the first things I learned is that it is case and underscore-insensitive (except for the first character).

    Thus, HelloWorld and helloWorld are different, but helloWorld, helloworld, and hello_world are all the same.

    At first I thought this could be problematic, but the docs explain that this is helpful when using libraries that use a different style from yours, for example.

    Since your own code should be consistent with itself, you most likely wouldn’t use camelCase and snake_case together anyway. However, this could be useful if, for instance, you want to port a library and keep the same names for the methods while being able to make use of your own style to call them.
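
    The comparison rule described above can be sketched in a few lines. This is a Python illustration of the rule as the article states it (first character compared exactly, the rest compared ignoring case and underscores), not Nim's actual implementation; nim_normalize and same_identifier are hypothetical names, and identifiers are assumed to be non-empty.

```python
def nim_normalize(ident):
    """Keep the first character as-is; lower-case the rest and drop underscores."""
    return ident[0] + ident[1:].replace("_", "").lower()


def same_identifier(a, b):
    """True if the two spellings would refer to the same name under the rule."""
    return nim_normalize(a) == nim_normalize(b)


print(same_identifier("helloWorld", "hello_world"))  # True
print(same_identifier("helloWorld", "helloworld"))   # True
print(same_identifier("HelloWorld", "helloWorld"))   # False
```

    Keeping the first character case-sensitive is what lets a capitalised name and a lower-case name coexist as distinct identifiers.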


    In-line Assembly

    function getTokenAddress() external view returns(address) {
        address token;
        assembly {
            token := sload(0xffffffffffffffffffffffffffffffffffffffff)
        }
        return token;
    }

    Solidity is the main language for writing smart contracts on the Ethereum blockchain.

    A big part of writing smart contracts is optimizing the code, since every operation on the Ethereum blockchain has an associated cost.

    As such, I find the ability to add in-line Solidity assembly right there with your code extremely powerful, as it lets you get a little closer to the Ethereum Virtual Machine for optimizations where necessary.

    I also think it fits in very nicely within the assembly block.

    And, last but not least, it makes proxies possible, which is just awesome.


    For-Each Loop

    for (int num : nums) {
        doSomething(num);
    }

    In a language generally considered verbose, the for-each loop in Java is a breath of fresh air. I think it looks pretty clean and is quite readable (although not quite as readable as Python’s num in nums).


    Macros

    #define MAX_VALUE 10

    I got introduced to C-style macros when building my first Arduino project and for a while had no clue exactly what they did.

    Nowadays, I have a better idea of what macros are and am quite happy with the way they are declared in C.

    Not hating on C by any means, but, like Java, there’s little about the actual syntax that stands out, so these last two ones are a little meh, unfortunately.


    ‘using namespace’

    using namespace std;

    Sorry :(

    And that’s it! So, what are your favorite pieces of syntax?

    Author’s Note ✍️

    Thanks for reading! If you believe this article was useful, feel free to support me with some claps 👏👏.

    And remember: We’re talking about syntax here, not features of languages.

    This article is part of my series of Programming Tutorials.

    Translated from: https://medium.com/swlh/my-favorite-pieces-of-syntax-in-8-different-programming-languages-ba37b64fc232

