Scala: environment setup (Windows and Linux), a demo, and Spark

I. Scala on Windows

If you don't have a server environment, you can set up a standalone Scala development environment on your local machine, install an IDE, and develop Scala programs locally.

Steps

1. Install JDK 1.8 and configure its environment variables

JDK 1.8 download link

2. Install Scala (2.10.5) and configure its environment variables

Scala 2.10.5 download link

3. Install Spark (1.6.1) and configure its environment variables

Spark 1.6.1 download link

4. Install Hadoop (2.6.0) and configure its environment variables

Hadoop 2.6.0 download link

5. Install Scala IDE (2.12)

Scala IDE 2.12 download link

Important note

Note: the standalone Scala installation must be version 2.10.5, because Spark 1.6.1 bundles Scala 2.10.5. When you import the Spark jars into the IDE, a standalone Scala of any other version triggers annoying "scala version mismatch" errors.

Remember
Once everything is installed, start the Scala IDE, configure the build path, import the Spark jars, and start writing programs!
All of the components above only need to be unpacked and have their environment variables configured.
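To catch a version mismatch early, you can print the Scala version your project actually resolves on the classpath. A minimal sketch (the exact output string depends on your installation):

object VersionCheck {
  def main(args: Array[String]): Unit = {
    // Prints the version of the Scala library on the classpath,
    // e.g. "version 2.10.5"; it must match the Scala bundled with Spark.
    println(scala.util.Properties.versionString)
  }
}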

II. Scala on Linux

If you want to run Scala programs on a Linux machine, or execute code interactively in the Scala REPL, you first need to set up a Scala environment there. The deployment steps follow.

Steps

1. Download scala-2.11.6.tgz

Scala 2.11.6 download link; after downloading, upload the file to the /opt directory on the Linux machine

2. Log in to the Linux machine and unpack the archive

     # cd /opt
     # tar -zxf scala-2.11.6.tgz
     # mv scala-2.11.6 scala
    

3. Edit the profile to add the Scala configuration

     # vim /etc/profile
    
    export SCALA_HOME=/opt/scala
    export PATH=$PATH:$SCALA_HOME/bin
    
 # source /etc/profile          # apply the changes
    

4. Verify the Scala installation

     # scala -version
    Scala code runner version 2.11.6 -- Copyright 2002-2013, LAMP/EPFL
     
     # scala
    Welcome to Scala version 2.11.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_71).
    Type in expressions to have them evaluated.
    Type :help for more information.
    
    scala> 9*9
    res0: Int = 81
    

Note

Note: the JDK must be installed before Scala can be used.
JDK download link

III. Demo

With the local Scala development environment in place, you can use the IDE to write Scala programs and small tools. The snippet below exercises a few pieces of basic syntax to get familiar with the language.

Code

package epoint.com.cn.test001

object test001 {
  def main(args: Array[String]) {
    val msg = "hello world"
    val greetStrings = Array("i", "like", "scala")
    println(msg)
    println(max(5, 6))
    greet()
    printargs(greetStrings)

    // ::: concatenates two immutable lists into a new list
    val oneTwo = List(1, 2)
    val threeFour = List(3, 4)
    val oneTwoThreeFour = oneTwo ::: threeFour
    println(oneTwo + " and " + threeFour + " were not mutated.")
    println("Thus, " + oneTwoThreeFour + " is a new list")

    // tuple elements are accessed as _1, _2, ...
    val pair = (99, "Luftballons")
    println(pair._1)
    println(pair._2)

    // jetSet is a var holding an immutable Set; += rebinds it to a new Set
    var jetSet = Set("Boeing", "Airbus")
    jetSet += "Lear"
    println(jetSet.contains("Boeing"))

    // Map literal built from key -> value pairs
    val romanNumeral = Map(1 -> "I", 2 -> "II",
      3 -> "III", 4 -> "IV", 5 -> "V")
    println(romanNumeral(4))
  }

  // returns the larger of two Ints
  def max(x: Int, y: Int): Int = {
    if (x > y) x
    else y
  }

  def greet() = println("xubin nihao")

  // prints each element of the array with a while loop
  def printargs(args: Array[String]) {
    var i = 0
    while (i < args.length) {
      println(args(i))
      i += 1
    }
  }
}
    

Output:

    hello world
    6
    xubin nihao
    i
    like
    scala
    List(1, 2) and List(3, 4) were not mutated.
    Thus, List(1, 2, 3, 4) is a new list
    99
    Luftballons
    true
    IV
    


IV. Scala on Spark

Spark is an extremely powerful compute engine written in Scala. It computes in memory and ships with very convenient facilities for graph processing, stream processing, machine learning, and interactive queries, so programming Spark in Scala is well worth learning. Below is a simple example in which Spark connects to MySQL and reads a table.

Steps

If you followed the Windows environment setup in section I, the Spark environment is already deployed, so you can write Spark programs in Scala right away.

package epoint.com.cn.test001

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object SparkConnMysql {
  def main(args: Array[String]) {
    println("Hello, world!")
    // run locally in a single JVM
    val conf = new SparkConf()
    conf.setAppName("wow,my first spark app")
    conf.setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // JDBC connection parameters
    val url = "jdbc:mysql://192.168.114.67:3306/user"
    val table = "user"
    // DataFrameReader is a mutable builder: each option() call records
    // a setting on the same reader, so the calls need not be chained
    val reader = sqlContext.read.format("jdbc")
    reader.option("url", url)
    reader.option("dbtable", table)
    reader.option("driver", "com.mysql.jdbc.Driver")
    reader.option("user", "root")
    reader.option("password", "11111")
    // load() runs the JDBC read and returns a DataFrame
    val df = reader.load()
    df.show()
  }
}
    

Result:

    Hello, world!
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/D:/spark1.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/D:/spark1.6/lib/spark-examples-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/D:/kettle7.1/inceptor-driver.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    17/11/21 11:43:53 INFO SparkContext: Running Spark version 1.6.1
    17/11/21 11:43:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    17/11/21 11:43:56 INFO SecurityManager: Changing view acls to: lenovo
    17/11/21 11:43:56 INFO SecurityManager: Changing modify acls to: lenovo
    17/11/21 11:43:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(lenovo); users with modify permissions: Set(lenovo)
    17/11/21 11:43:59 INFO Utils: Successfully started service 'sparkDriver' on port 55824.
    17/11/21 11:43:59 INFO Slf4jLogger: Slf4jLogger started
    17/11/21 11:43:59 INFO Remoting: Starting remoting
    17/11/21 11:43:59 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.114.67:55837]
    17/11/21 11:43:59 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 55837.
    17/11/21 11:43:59 INFO SparkEnv: Registering MapOutputTracker
    17/11/21 11:43:59 INFO SparkEnv: Registering BlockManagerMaster
    17/11/21 11:43:59 INFO DiskBlockManager: Created local directory at C:\Users\lenovo\AppData\Local\Temp\blockmgr-16383e3c-7cb6-43c7-b300-ccc1a1561bb4
    17/11/21 11:43:59 INFO MemoryStore: MemoryStore started with capacity 1129.9 MB
    17/11/21 11:44:00 INFO SparkEnv: Registering OutputCommitCoordinator
    17/11/21 11:44:00 INFO Utils: Successfully started service 'SparkUI' on port 4040.
    17/11/21 11:44:00 INFO SparkUI: Started SparkUI at http://192.168.114.67:4040
    17/11/21 11:44:00 INFO Executor: Starting executor ID driver on host localhost
    17/11/21 11:44:00 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55844.
    17/11/21 11:44:00 INFO NettyBlockTransferService: Server created on 55844
    17/11/21 11:44:00 INFO BlockManagerMaster: Trying to register BlockManager
    17/11/21 11:44:00 INFO BlockManagerMasterEndpoint: Registering block manager localhost:55844 with 1129.9 MB RAM, BlockManagerId(driver, localhost, 55844)
    17/11/21 11:44:00 INFO BlockManagerMaster: Registered BlockManager
    17/11/21 11:44:05 INFO SparkContext: Starting job: show at SparkConnMysql.scala:25
    17/11/21 11:44:05 INFO DAGScheduler: Got job 0 (show at SparkConnMysql.scala:25) with 1 output partitions
    17/11/21 11:44:05 INFO DAGScheduler: Final stage: ResultStage 0 (show at SparkConnMysql.scala:25)
    17/11/21 11:44:05 INFO DAGScheduler: Parents of final stage: List()
    17/11/21 11:44:05 INFO DAGScheduler: Missing parents: List()
    17/11/21 11:44:05 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at show at SparkConnMysql.scala:25), which has no missing parents
    17/11/21 11:44:06 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 5.2 KB, free 5.2 KB)
    17/11/21 11:44:06 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.5 KB, free 7.7 KB)
    17/11/21 11:44:06 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:55844 (size: 2.5 KB, free: 1129.9 MB)
    17/11/21 11:44:06 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
    17/11/21 11:44:06 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at show at SparkConnMysql.scala:25)
    17/11/21 11:44:06 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
    17/11/21 11:44:06 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 1922 bytes)
    17/11/21 11:44:06 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
    17/11/21 11:44:06 INFO JDBCRDD: closed connection
    17/11/21 11:44:06 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 3472 bytes result sent to driver
    17/11/21 11:44:06 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 224 ms on localhost (1/1)
    17/11/21 11:44:06 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
    17/11/21 11:44:06 INFO DAGScheduler: ResultStage 0 (show at SparkConnMysql.scala:25) finished in 0.261 s
    17/11/21 11:44:06 INFO DAGScheduler: Job 0 finished: show at SparkConnMysql.scala:25, took 1.467252 s
    +---+----+----+------------+------------------+---------+-------+
    | id|name| age|       phone|             email|startdate|enddate|
    +---+----+----+------------+------------------+---------+-------+
    | 11| 徐心三|  24|     2423424|    2423424@qq.com|     null|   null|
    | 33| 徐心七|  23|    23232323|          13131@qe|     null|   null|
    | 55|  徐彬|  22| 15262301036|徐彬757661238@ww.com|     null|   null|
    | 44|  徐成|3333| 23423424332|    2342423@qq.com|     null|   null|
    | 66| 徐心四|  23|242342342423|    徐彬23424@qq.com|     null|   null|
    | 11| 徐心三|  24|     2423424|    2423424@qq.com|     null|   null|
    | 33| 徐心七|  23|    23232323|          13131@qe|     null|   null|
    | 55|  徐彬|  22| 15262301036|徐彬757661238@ww.com|     null|   null|
    | 44|  徐成|3333| 23423424332|    2342423@qq.com|     null|   null|
    | 66| 徐心四|  23|242342342423|    徐彬23424@qq.com|     null|   null|
    | 88| 徐心八| 123|   131231312|       123123@qeqe|     null|   null|
    | 99| 徐心二|  23|    13131313|   1313133@qeq.com|     null|   null|
    |121| 徐心五|  13|   123131231|    1231312@qq.com|     null|   null|
    |143| 徐心九|  23|      234234|        徐彬234@wrwr|     null|   null|
    +---+----+----+------------+------------------+---------+-------+
    only showing top 14 rows
    
    17/11/21 11:44:06 INFO SparkContext: Invoking stop() from shutdown hook
    17/11/21 11:44:06 INFO SparkUI: Stopped Spark web UI at http://192.168.114.67:4040
    17/11/21 11:44:06 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    17/11/21 11:44:06 INFO MemoryStore: MemoryStore cleared
    17/11/21 11:44:06 INFO BlockManager: BlockManager stopped
    17/11/21 11:44:06 INFO BlockManagerMaster: BlockManagerMaster stopped
    17/11/21 11:44:06 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    17/11/21 11:44:06 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
    17/11/21 11:44:06 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
    17/11/21 11:44:06 INFO SparkContext: Successfully stopped SparkContext
    17/11/21 11:44:07 INFO ShutdownHookManager: Shutdown hook called
    17/11/21 11:44:07 INFO ShutdownHookManager: Deleting directory C:\Users\lenovo\AppData\Local\Temp\spark-7877d903-f8f7-4efb-9e0c-7a11ac147153
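Because DataFrameReader mutates itself and returns this from option(), the same read is often written as one chained expression. A sketch under the same assumed connection parameters:

val df = sqlContext.read.format("jdbc")
  .option("url", "jdbc:mysql://192.168.114.67:3306/user")
  .option("dbtable", "user")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("user", "root")
  .option("password", "11111")
  .load()
df.show()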
    
Scala collections: List, Set, Map, and iterators

Converting to a string

The toString method returns a string showing all elements of the List.

Sample code

    scala> val a = List(1,2,3,4)
    a: List[Int] = List(1, 2, 3, 4)

    scala> println(a.toString)
    List(1, 2, 3, 4)

Building a string

The mkString method joins the elements into a string, optionally with a separator. By default there is no separator.

Sample code

    scala> val a = List(1,2,3,4)
    a: List[Int] = List(1, 2, 3, 4)

    scala> a.mkString
    res7: String = 1234

    scala> a.mkString(":")
    res8: String = 1:2:3:4

Union

union concatenates two lists without removing duplicates.

Sample code

    scala> val a1 = List(1,2,3,4)
    a1: List[Int] = List(1, 2, 3, 4)

    scala> val a2 = List(3,4,5,6)
    a2: List[Int] = List(3, 4, 5, 6)

// union
    scala> a1.union(a2)
    res17: List[Int] = List(1, 2, 3, 4, 3, 4, 5, 6)

// call distinct to remove duplicates
    scala> a1.union(a2).distinct
    res18: List[Int] = List(1, 2, 3, 4, 5, 6)

Intersection

intersect returns the intersection of two lists.

    scala> val a1 = List(1,2,3,4)
    a1: List[Int] = List(1, 2, 3, 4)

    scala> val a2 = List(3,4,5,6)
    a2: List[Int] = List(3, 4, 5, 6)

    scala> a1.intersect(a2)
    res19: List[Int] = List(3, 4)

Difference

diff returns the difference of two lists: a1.diff(a2) yields the elements of a1 that are not present in a2.

    scala> val a1 = List(1,2,3,4)
    a1: List[Int] = List(1, 2, 3, 4)

    scala> val a2 = List(3,4,5,6)
    a2: List[Int] = List(3, 4, 5, 6)

    scala> a1.diff(a2)
    res24: List[Int] = List(1, 2)

Set

A Set() is a collection with no duplicate elements. Sets have the following properties:

1. Elements are unique
2. Insertion order is not preserved

Scala sets also come in two flavors: immutable and mutable.

Immutable Set

Syntax

Create an empty immutable set:

val/var name = Set[Type]()

Create an immutable set from given elements:

val/var name = Set(elem1, elem2, elem3...)

Example 1

Define an empty immutable set

Sample code

    scala> val a = Set[Int]()
    a: scala.collection.immutable.Set[Int] = Set()

Example 2

Define an immutable set holding the elements: 1,1,1,1,1,3,2,4,8

Sample code

    scala> val a = Set(1,1,1,1,1,3,2,4,8)
    a: scala.collection.immutable.Set[Int] = Set(1, 2, 3, 8, 4)

Basic operations

1. Get the size of the set (size)
2. Iterate over the set (same as iterating over an array)
3. Add an element, producing a new Set (+) (see the sketch after the sample code)
4. Concatenate two sets, producing a new Set (++)
5. Concatenate a set and a list, producing a new Set (++)

Example

Sample code

// create the set
scala> val a = Set(1,1,2,3,4,5)
a: scala.collection.immutable.Set[Int] = Set(5, 1, 2, 3, 4)

// get the size of the set
scala> a.size
res0: Int = 5

// iterate over the set
scala> for(i <- a) println(i)

// remove an element
scala> a - 1
res5: scala.collection.immutable.Set[Int] = Set(5, 2, 3, 4)

// concatenate two sets
scala> a ++ Set(6,7,8)
res2: scala.collection.immutable.Set[Int] = Set(5, 1, 6, 2, 7, 3, 8, 4)

// concatenate a set and a list
scala> a ++ List(6,7,8,9)
res6: scala.collection.immutable.Set[Int] = Set(5, 1, 6, 9, 2, 7, 3, 8, 4)
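The + operation from point 3 of the list above is not shown in the transcript; a minimal sketch (the res number and printed element order may differ in your session):

// add an element, producing a new Set
scala> a + 6
res3: scala.collection.immutable.Set[Int] = Set(5, 1, 6, 2, 3, 4)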

Mutable Set

Definition

A mutable set is created the same way as an immutable one; you just need to import the mutable Set class first.

Import it manually: import scala.collection.mutable.Set

Sample code

scala> import scala.collection.mutable.Set
import scala.collection.mutable.Set

scala> val a = Set(1,2,3,4)
a: scala.collection.mutable.Set[Int] = Set(1, 2, 3, 4)

// add an element
scala> a += 5
res25: a.type = Set(1, 5, 2, 3, 4)

// remove an element
scala> a -= 1
res26: a.type = Set(5, 2, 3, 4)

Map

A Map, also called a mapping, is a collection of key-value pairs. In Scala, Maps likewise come in immutable and mutable variants.

Immutable Map

Syntax

val/var map = Map(key -> value, key -> value, key -> value...) // recommended, more readable
val/var map = Map((key, value), (key, value), (key, value)...)

Sample code

    scala> val map = Map("zhangsan"->30, "lisi"->40)
    map: scala.collection.immutable.Map[String,Int] = Map(zhangsan -> 30, lisi -> 40)

    scala> val map = Map(("zhangsan", 30), ("lisi", 30))
    map: scala.collection.immutable.Map[String,Int] = Map(zhangsan -> 30, lisi -> 30)

// get the value for a key
    scala> map("zhangsan")
    res10: Int = 30

Updating

An immutable Map cannot be updated in place:

scala> map("zhangsan")=33
<console>:11: error: value update is not a member of scala.collection.immutable.Map[String,Int]
              map("zhangsan")=33

Mutable Map

Definition

The syntax is the same as for an immutable Map, but you must first import scala.collection.mutable.Map manually.

scala> import scala.collection.mutable.Map
import scala.collection.mutable.Map

    scala> val map = Map("zhangsan"->30, "lisi"->40)
    map: scala.collection.mutable.Map[String,Int] = Map(lisi -> 40, zhangsan -> 30)

// update a value
    scala> map("zhangsan") = 20

Basic Map operations

Operations

1. Get a value (map(key))
2. Get all keys (map.keys)
3. Get all values (map.values)
4. Iterate over the map
5. getOrElse
6. Add a key-value pair
7. Remove a key

Example

Sample code

scala> val map = Map("zhangsan"->30, "lisi"->40)
map: scala.collection.mutable.Map[String,Int] = Map(lisi -> 40, zhangsan -> 30)

// get zhangsan's age
scala> map("zhangsan")
res10: Int = 30

// get all student names
scala> map.keys
res13: Iterable[String] = Set(lisi, zhangsan)

// get all student ages
scala> map.values
res14: Iterable[Int] = HashMap(40, 30)

// print every student's name and age
scala> for((x,y) <- map) println(s"$x $y")
lisi 40
zhangsan 30

// get wangwu's age, returning -1 if wangwu is absent
scala> map.getOrElse("wangwu", -1)
res17: Int = -1

// add a new student: wangwu, 35
scala> map += "wangwu"->35
res22: scala.collection.mutable.Map[String,Int] = Map(lisi -> 40, zhangsan -> 30, wangwu -> 35)

// remove lisi from the mutable map
scala> map -= "lisi"
res23: scala.collection.mutable.Map[String,Int] = Map(zhangsan -> 30)

Iterators

For each of its collection types, Scala provides an iterator for visiting the elements one by one.

Traversing a collection with an iterator:

1. The iterator method obtains an iterator from a collection
2. An iterator has two basic operations:
  1. hasNext — tests whether there is a next element
  2. next — returns the iterator's next element, throwing NoSuchElementException if there is none
3. Every iterator is stateful
4. After a full traversal it stays positioned past the last element
5. Calling next again then throws NoSuchElementException (see the sketch below)
6. You can use while or for to return the elements one by one
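The statefulness described in points 3-5 is easy to see in the REPL. A minimal sketch (the res numbers and exact exception message may vary by Scala version):

scala> val it = List(1, 2).iterator
it: Iterator[Int] = non-empty iterator

scala> it.next
res0: Int = 1

scala> it.next
res1: Int = 2

// the iterator is now exhausted; calling next again throws
scala> it.next
java.lang.NoSuchElementException: next on empty iterator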

Example

1. Define a list containing the elements: 1,2,3,4,5
2. Traverse and print the list using a while loop and an iterator

Sample code

scala> var a = List(1,2,3,4,5)
a: List[Int] = List(1, 2, 3, 4, 5)

scala> val ite = a.iterator
ite: Iterator[Int] = non-empty iterator

scala> while(ite.hasNext){
     |   println(ite.next)
     | }

Example

1. Define a list containing the elements: 1,2,3,4,5
2. Traverse and print the list using a for expression and an iterator

Sample code

scala> val a = List(1,2,3,4,5)
a: List[Int] = List(1, 2, 3, 4, 5)

scala> for(i <- a) println(i)
