  • This is the submission template for Elsevier - Pattern Recognition Letters. Create a new blank project in Overleaf, unzip the archive, and upload its contents to the project to use it.
  • Statistical Pattern Recognition
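  • Pattern Classification, 2nd edition, Duda: exercise solutions (complete)
  • A classic textbook on pattern recognition and classification: Pattern Classification (DHS), original English edition + Chinese translation + exercise solutions [2nd edition]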
  • Design*Pattern*Framework*4.5

    2016-09-24 13:42:22
    Design*Pattern*Framework*4.5
  • Pattern Recognition and Machine Learning. Bishop. English edition, complete PDF
  • A classic machine learning book, PRML (Pattern Recognition And Machine Learning), with exercise solutions
  • Pattern Classification, 2nd edition, English edition. Author: Duda
  • MapReduce Design Pattern

    2018-03-02 08:51:14
    MapReduce Design Pattern
  • Bayer Pattern to RGB

    2013-03-06 09:15:09
    Bayer Pattern to RGB
  • Pattern Recognition and Machine Learning: textbook + complete exercise solutions, best read side by side, enjoy it~
  • Dynamic loading of Flink CEP patterns

    10k+ reads 2020-03-05 11:27:31

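    Shipping a Flink CEP job usually takes three steps: develop, package, deploy. Suppose we have a job that finds users with two consecutive failed logins within 60 seconds:

    begin("begin").where(_.status=="fail").next("next").where(_.status=="fail").within(Time.seconds(60))

    Then a second job comes along: find users whose login fails once and then succeeds on the next attempt, again within 60 seconds:

    begin("begin").where(_.status=="fail").next("next").where(_.status=="success").within(Time.seconds(60))

    The two jobs have identical inputs and outputs; the only difference is the pattern. Normally we would have to repeat all three steps to bring the second job online. If we could instead build the pattern object dynamically from a parameter passed to the Flink job, releasing a new job would become much simpler.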
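
    To make the "same job, different pattern" idea concrete, here is a minimal plain-Java sketch that treats the rule as data. It does not use Flink; the class name, method name, and parameters are hypothetical, and it only approximates the meaning of begin/next/within for a single user's time-ordered events (two adjacent events with the expected statuses, less than the window apart).

```java
public class LoginRuleSketch {

    /**
     * Counts adjacent event pairs whose statuses equal (first, second)
     * and whose timestamps are less than windowSeconds apart.
     * statuses[i] and epochSeconds[i] describe the i-th event of one user,
     * in arrival order. This is an illustration, not Flink CEP itself.
     */
    public static int countMatches(String[] statuses, long[] epochSeconds,
                                   String first, String second, long windowSeconds) {
        int matches = 0;
        for (int i = 0; i + 1 < statuses.length; i++) {
            boolean statusOk = statuses[i].equals(first) && statuses[i + 1].equals(second);
            boolean inWindow = epochSeconds[i + 1] - epochSeconds[i] < windowSeconds;
            if (statusOk && inWindow) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String[] statuses = {"fail", "fail", "success"};
        long[] times = {0L, 10L, 20L};
        // Same input, two different rules supplied as parameters.
        System.out.println(countMatches(statuses, times, "fail", "fail", 60));
        System.out.println(countMatches(statuses, times, "fail", "success", 60));
    }
}
```

    Only the rule arguments change between the two calls, which is the property the rest of the post tries to obtain for real Flink patterns.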

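    How can pattern dynamic loading be implemented? The work splits into two steps.

    1. Convert the pattern rule into a pattern string
    val patternStr = "begin(\"begin\").where(_.status==\"fail\")" +
            ".next(\"next\").where(_.status==\"fail\").within(Time.seconds(60))"
    2. Convert the pattern string into a pattern object
    val pattern = transPattern(patternStr)

    Let's look at step 2 first: assuming we already have patternStr, how do we turn it into a pattern object?

    For str->obj, the first approach that comes to mind is to use javax.script.ScriptEngine to call a method in a Groovy script that builds the pattern object.

    The Groovy script is as follows: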

    import com.hhz.flink.cep.pojo.LoginEvent
    import com.hhz.flink.cep.patterns.conditions.LogEventCondition
    import org.apache.flink.cep.scala.pattern.Pattern
    import org.apache.flink.streaming.api.windowing.time.Time
    def getP(){
        return Pattern.<LoginEvent>begin("begin")
        .where(new LogEventCondition("getField(eventType)==\"fail\""))
        .next("next").where(new LogEventCondition("getField(eventType)==\"fail\""))
        .times(2).within(Time.seconds(3))
    }

    This script can be loaded into memory as a string by the Groovy script engine; calling getP() through invokeFunction then returns the pattern object. Pseudocode:

    String script="def getP(){return Pattern.<>....within(Time.seconds(3)))}";
    ScriptEngineManager factory = new ScriptEngineManager();
    ScriptEngine engine =  factory.getEngineByName("groovy");
    engine.eval(script);
    // invokeFunction is defined on Invocable, so cast the engine first
    Invocable inv = (Invocable) engine;
    Pattern<LoginEvent, LoginEvent> pattern
        = (Pattern<LoginEvent, LoginEvent>) inv.invokeFunction("getP");

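    Now back to step 1: converting the pattern rule into a pattern string.

    In Scala the pattern code looks like this:

    begin("begin").where(_.status=="fail").next("next").where(_.status=="fail").within(Time.seconds(60))

    The where method accepts a lambda expression such as _.status=="fail", and it also accepts a SimpleCondition object, for example: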

    where(new SimpleCondition<LoginEvent>() {
                @Override
                public boolean filter(LoginEvent event) {
                    // compare string contents with equals(); == would compare references
                    return "fail".equals(event.eventType());
                }
            })

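    Groovy, however, does not accept an expression like _.status=="fail" as a function argument, so when generating the pattern string the expression inside where has to be replaced with a SimpleCondition object.

    That raises a problem: once we switch to SimpleCondition, the expression logic has to be implemented inside filter. That is clearly not what we want, because maintaining patternStr would become far too expensive. Instead, we can pass the expression into the condition object as a string and have filter evaluate it automatically, like this: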

    where(new LogEventCondition("getField(eventType)==\"fail\""))

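    The filter method evaluates the expression getField(eventType)==\"fail\" and returns true or false. The hard part is how to evaluate an expression given as a string, and this is where the aviator package comes in. Here is a small aviator example: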

    import com.googlecode.aviator.AviatorEvaluator;
    public class TestAviator {
        public static void main(String[] args) {
            Long result = (Long) AviatorEvaluator.execute("1+2+3");
            System.out.println("-------"+result);
        }
    }
    Output:
    -------6

    Now for the details of LogEventCondition:

    public class LogEventCondition  extends SimpleCondition<LoginEvent> implements Serializable {
        private String script;
    
        static {
            AviatorEvaluator.addFunction(new GetFieldFunction());
        }
    
        //getField(eventType)==\"fail\"
        public LogEventCondition(String script){
            this.script = script;
        }
    
        @Override
        public boolean filter(LoginEvent value) throws Exception {
            Map<String, Object> stringObjectMap = Obj2Map.objectToMap(value);
            // evaluate the expression against the event's fields
            boolean result = (Boolean) AviatorEvaluator.execute(script, stringObjectMap);
            return result;
        }
    
    }
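
    To show the idea behind a string-driven condition without any external dependency, here is a minimal sketch that evaluates expressions of the form getField(name)=="value" against a field map. It deliberately swaps Aviator for a tiny hand-written parser; the class name, the mini-grammar, and the eval method are assumptions made for illustration and are not part of the post's code or of Aviator.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExprConditionSketch {

    // Hypothetical mini-grammar: getField(<name>) followed by == or != and a quoted literal.
    private static final Pattern EXPR =
            Pattern.compile("\\s*getField\\((\\w+)\\)\\s*(==|!=)\\s*\"([^\"]*)\"\\s*");

    /** Evaluates the expression against the given fields; rejects anything outside the grammar. */
    public static boolean eval(String expr, Map<String, Object> fields) {
        Matcher m = EXPR.matcher(expr);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported expression: " + expr);
        }
        Object value = fields.get(m.group(1));
        // A missing field is treated as "not equal" to any literal.
        boolean equal = value != null && m.group(3).equals(value.toString());
        return m.group(2).equals("==") ? equal : !equal;
    }

    public static void main(String[] args) {
        Map<String, Object> event = Map.of("eventType", "fail", "userId", "u1");
        System.out.println(eval("getField(eventType)==\"fail\"", event));
        System.out.println(eval("getField(eventType)==\"success\"", event));
    }
}
```

    A real expression engine handles far more (arithmetic, boolean operators, custom functions); the sketch only shows why filter can stay generic once the rule lives in a string.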

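    That wraps up dynamic loading of CEP patterns. At first I also wanted to do what some big companies do: design a custom DSL and parse DSL statements into patterns, so that non-developers could define real-time computing jobs directly on the company's data platform. But remembering how painful it was to teach SQL to non-engineering colleagues, I gave up on that. The learning cost of defining patterns is still fairly high, and handing it to operations or product staff is not reliable, so this work still has to be done by developers. And for developers, the native pattern syntax is the best rule format~~~

    ------------ Update 2021-09-17 ------------

    Adding the code that was left out earlier: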

    package com.hhz.flink.cep.patterns.aviator;
    
    import com.googlecode.aviator.AviatorEvaluator;
    import com.googlecode.aviator.runtime.function.FunctionUtils;
    import com.googlecode.aviator.runtime.type.AviatorJavaType;
    import com.googlecode.aviator.runtime.type.AviatorObject;
    import com.googlecode.aviator.runtime.type.AviatorString;
    
    import java.util.HashMap;
    import java.util.Map;
    
    public class GetFieldFunction extends HhzFieldFunction {
    
        @Override
        public String getName() {
            return "getField";
        }
    
        @Override
        public AviatorString call(Map<String, Object> params, AviatorObject arg1) {
    
            AviatorJavaType field = (AviatorJavaType)arg1;
            String name = field.getName();
    
            if(name.contains(".")){
                return new AviatorString(jsonValue(name, params));
            }
    
            String stringValue = FunctionUtils.getStringValue(arg1, params);
            return new AviatorString(stringValue);
        }
    
        public static void main(String[] args) {
            AviatorEvaluator.addFunction(new GetInt());
            Map<String,Object> m = new HashMap<String, Object>();
            m.put("person","{age:12,name:\"zhangsan\"}");
            System.out.println(AviatorEvaluator.execute("getInt(person.age)<12", m));
        }
    }
    
    
    
    
    
    
      
    package com.hhz.flink.cep.patterns.aviator;
    
    import com.alibaba.fastjson.JSONObject;
    import com.googlecode.aviator.runtime.function.AbstractFunction;
    
    import java.util.Map;
    
    public abstract class HhzFieldFunction extends AbstractFunction {
    
    
        public String jsonValue(String fieldName, Map<String, Object> params){
            String[] arr = fieldName.split("\\.");
            String json = params.get(arr[0]).toString();
            JSONObject object = JSONObject.parseObject(json);
            return object.getString(arr[1]);
        }
    }
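
    The dotted-name lookup in jsonValue can be illustrated with a self-contained sketch. It uses nested java.util.Map values in place of a fastjson JSONObject, so it shows only the split-then-descend logic; the class and method names are hypothetical, and unlike the original it returns null instead of throwing when a key is absent.

```java
import java.util.Map;

public class DottedFieldSketch {

    /**
     * Resolves "outer.inner" against params, where params.get("outer") is itself a Map.
     * A name without a dot is looked up directly. Returns null when nothing is found.
     */
    public static String lookup(String fieldName, Map<String, Object> params) {
        String[] parts = fieldName.split("\\.", 2);
        Object outer = params.get(parts[0]);
        if (parts.length == 1) {
            return outer == null ? null : outer.toString();
        }
        if (!(outer instanceof Map)) {
            return null;
        }
        Object inner = ((Map<?, ?>) outer).get(parts[1]);
        return inner == null ? null : inner.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> person = Map.of("age", 12, "name", "zhangsan");
        Map<String, Object> params = Map.of("person", person);
        System.out.println(lookup("person.age", params));
        System.out.println(lookup("person.name", params));
    }
}
```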

  • datapattern.js

    Hot discussion 2013-04-02 23:19:55
    JS code for converting JSON-format timestamps into regular date/time values; time-format code for JSON serialization
  • at java.util.regex.Pattern$CharProperty.match(Pattern.java:3776) at java.util.regex.Pattern$Curly.match0(Pattern.java:4260) at java.util.regex.Pattern$Curly.match(Pattern.java:4234) at java.util....

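    First, the exception.

    The server's CPU usage had been holding at around 10%, then recently spiked and showed no sign of coming down; even after a restart it kept climbing until the process died O/\O. Checking the processes:

    The Java process was the one pegging the CPU, so the next step was to find which threads under that process had high CPU usage.

    The problem threads were found: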

    [root@27a2a017-029b-48d8-bee2-255610ec9649 ~]# ps -mp 190180 -o THREAD,tid,time|uniq -c|sort -nr
          1 USER     %CPU PRI SCNT WCHAN  USER SYSTEM    TID     TIME
          1 admin    83.8  19    - -         -      - 231456 1-00:42:04
          1 admin    82.4  19    - -         -      - 390319 16:22:37
          1 admin    62.3   -    - -         -      -      - 11-02:21:35
          1 admin    44.3  19    - -         -      - 390320 08:48:52
          1 admin     3.4  19    - futex_    -      -  60245 00:00:34
          1 admin     2.8  19    - futex_    -      -  60243 00:00:28
          1 admin     2.7  19    - futex_    -      -  50481 00:01:25
          1 admin     2.6  19    - futex_    -      -  57651 00:00:42
          1 admin     2.5  19    - futex_    -      -  61324 00:00:19
          1 admin    25.0  19    - -         -      -  63595 3-17:04:59
          1 admin     2.3  19    - futex_    -      -  61321 00:00:18
          1 admin     2.3  19    - futex_    -      -  60244 00:00:24
          1 admin     2.3  19    - futex_    -      -  45505 00:01:38
          1 admin     2.2  19    - futex_    -      -  57650 00:00:35
          1 admin     1.8  19    - futex_    -      -  61327 00:00:14
          1 admin     1.8  19    - futex_    -      -  61320 00:00:14
          1 admin     1.5  19    - -         -      -  60189 00:00:15

    Next, find the thread's stack trace (remember to convert the TID to hexadecimal: 390319 => 5f4af)
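
    ps prints the TID in decimal, while jstack shows the same native thread ID as a hexadecimal nid, so the value has to be converted before grepping. A minimal sketch of that conversion (the class and method names are made up for illustration):

```java
public class TidToHex {

    /** Formats a decimal thread ID the way jstack prints its nid field, e.g. 0x5f4af. */
    public static String toNid(long tid) {
        return "0x" + Long.toHexString(tid);
    }

    public static void main(String[] args) {
        System.out.println(toNid(390319L)); // prints 0x5f4af
    }
}
```
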

    [admin@27a2a017-029b-48d8-bee2-255610ec9649 ~]jstack 190180 |grep 5f4af -A 500
    "JSF-BZ-22000-99-T-1114" #45190 daemon prio=5 os_prio=0 tid=0x00007fd36400f800 nid=0x5f4af runnable [0x00007fd35ccd2000]
       java.lang.Thread.State: RUNNABLE
            at java.util.regex.Pattern$5.isSatisfiedBy(Pattern.java:5251)
            at java.util.regex.Pattern$5.isSatisfiedBy(Pattern.java:5251)
            at java.util.regex.Pattern$5.isSatisfiedBy(Pattern.java:5251)
            at java.util.regex.Pattern$5.isSatisfiedBy(Pattern.java:5251)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3776)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4260)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.matchInit(Pattern.java:4804)
            at java.util.regex.Pattern$Prolog.match(Pattern.java:4741)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupCurly.match0(Pattern.java:4485)
            at java.util.regex.Pattern$GroupCurly.match(Pattern.java:4405)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4794)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4279)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Loop.matchInit(Pattern.java:4801)
            at java.util.regex.Pattern$Prolog.match(Pattern.java:4741)
            at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
            at java.util.regex.Pattern$BranchConn.match(Pattern.java:4568)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4272)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3798)
            at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3798)
            at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3798)
            at java.util.regex.Pattern$Curly.match0(Pattern.java:4247)
            at java.util.regex.Pattern$Curly.match(Pattern.java:4234)
            at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3798)
            at java.util.regex.Pattern$Branch.match(Pattern.java:4604)
            at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
            at java.util.regex.Pattern$Begin.match(Pattern.java:3525)
            at java.util.regex.Matcher.match(Matcher.java:1270)
            at java.util.regex.Matcher.matches(Matcher.java:604)
            at com.jd.union.open.gateway.common.util.RegexUtils.verifyUrl(RegexUtils.java:14)
            at com.jd.union.open.gateway.service.api.goods.GoodsServiceImpl.queryLinkGoods(GoodsServiceImpl.java:664)
            at com.jd.union.open.gateway.service.api.goods.GoodsServiceImpl$$FastClassBySpringCGLIB$$1f88f92c.invoke(<generated>)
            at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204)
            at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:738)
            at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157)
            at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
            at com.jd.union.open.gateway.common.aop.JOpenAspect.execJOpenAspect(JOpenAspect.java:188)
            at sun.reflect.GeneratedMethodAccessor100.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:497)
            at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
            at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
            at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
            at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:168)
            at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85)
            at com.jd.ump.annotation.JAnnotation.execJAnnotation(JAnnotation.java:105)
            at sun.reflect.GeneratedMethodAccessor99.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:497)
            at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
            at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
            at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
            at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:168)
            at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
            at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
            at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:673)
            at com.jd.union.open.gateway.service.api.goods.GoodsServiceImpl$$EnhancerBySpringCGLIB$$9bc334b3.queryLinkGoods(<generated>)
            at sun.reflect.GeneratedMethodAccessor346.invoke(Unknown Source)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:497)
            at com.jd.jsf.gd.filter.ProviderInvokeFilter.reflectInvoke(ProviderInvokeFilter.java:140)
            at com.jd.jsf.gd.filter.ProviderInvokeFilter.invoke(ProviderInvokeFilter.java:100)
            at com.jd.union.open.gateway.common.filter.GatewayFilter.invoke(GatewayFilter.java:20)
            at com.jd.jsf.gd.filter.ProviderConcurrentsFilter.invoke(ProviderConcurrentsFilter.java:62)
            at com.jd.jsf.gd.filter.ProviderTimeoutFilter.invoke(ProviderTimeoutFilter.java:39)
            at com.jd.jsf.gd.filter.ProviderMethodCheckFilter.invoke(ProviderMethodCheckFilter.java:78)
            at com.jd.jsf.gd.filter.ProviderInvokeLimitFilter.invoke(ProviderInvokeLimitFilter.java:54)
            at com.jd.jsf.gd.filter.ProviderHttpGWFilter.invoke(ProviderHttpGWFilter.java:47)
            at com.jd.jsf.gd.filter.ProviderGenericFilter.invoke(ProviderGenericFilter.java:99)
            at com.jd.jsf.gd.filter.ProviderContextFilter.invoke(ProviderContextFilter.java:73)
            at com.jd.jsf.gd.filter.ExceptionFilter.invoke(ExceptionFilter.java:49)
            at com.jd.jsf.gd.filter.SystemTimeCheckFilter.invoke(SystemTimeCheckFilter.java:79)
            at com.jd.jsf.gd.filter.FilterChain.invoke(FilterChain.java:275)
            at com.jd.jsf.gd.server.ProviderProxyInvoker.invoke(ProviderProxyInvoker.java:67)
            at com.jd.jsf.gd.server.JSFTask.doRun(JSFTask.java:123)
            at com.jd.jsf.gd.server.BaseTask.run(BaseTask.java:27)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:745)

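    The cause is now clear: it is a code problem (the approach above can be used to track down many kinds of issues). The repeated execution inside the regular-expression validation is what drove the CPU usage up.

    The regular expression is: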

    private static final Pattern PATTERN_URL = Pattern.compile( "^([hH][tT]{2}[pP]:/*|[hH][tT]{2}[pP][sS]:/*|[fF][tT][pP]:/*)(([A-Za-z0-9-~]+).)+([A-Za-z0-9-~\\/])+(\\?{0,1}(([A-Za-z0-9-~]+\\={0,1})([A-Za-z0-9-~]*)\\&{0,1})*)$");

    The input that triggered the problem:

    https://coupon.m.watereasy.com/coupons/show.action?key=aec58667c0c14d7bae24vbb73b83efcv&amp;roleId=278889568&amp;mall.watereasy.com/index-5935345.html\n\n100-50香蕉

     

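    The regex after the fix (the URL regexes found online are all flawed to some degree, so it is better to tailor one to your own business needs; once again, Chinese characters were the trigger):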

    private static final Pattern PATTERN_URL= Pattern.compile( "^(([hH][tT]{2}[pP]|[hH][tT]{2}[pP][sS])://)(([A-Za-z0-9-~]+).)+([A-Za-z0-9-~\\/])+([\\w./?%&=\\u4e00-\\u9fa5]+)$");

     

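    This expression runs without trouble 99.99% of the time, but every so often a thread somehow "runs away" from the rest of the server and has to be killed by hand (otherwise it eventually ends in a stack overflow).

    The root cause is that the regular expression has the classic /^(A*)*$/ form. Note that the runaway behavior occurs only when the pattern does not match the target string.

    For the runaway pattern ^(A*|B*|C*|D*)*$ there are several possible fixes:

    •  ^(A|B|C|D)*$  –  remove the star ("zero or more" quantifier) from each of the four alternatives in the group.
    •  ^(A*+|B*+|C*+|D*+)*$ – make each alternative's star quantifier possessive (that is, change each * to *+).
    •  ^(?>A*|B*|C*|D*)*$ – make the group containing the alternatives atomic (an atomic group commits to the first alternative that succeeds, so check that it still accepts the inputs you expect).

    The second should run slightly faster than the first, but all three stop the "regex gone wild" behavior. (It is also best not to parse HTML with regular expressions.)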
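A minimal sketch of the first two fixes, using [a-z] and [0-9] as stand-ins for the alternatives A and B. The class name RunawayRegexFix and its helper methods are made up for this illustration; the nested form ^([a-z]*|[0-9]*)*$ that the text warns about is deliberately not executed here.

```java
import java.util.regex.Pattern;

public class RunawayRegexFix {
    // Fix 1: drop the inner star so that only the outer group repeats.
    static final Pattern SIMPLIFIED = Pattern.compile("^(?:[a-z]|[0-9])*$");

    // Fix 2: possessive inner quantifiers (*+) never give characters back.
    static final Pattern POSSESSIVE = Pattern.compile("^(?:[a-z]*+|[0-9]*+)*$");

    static boolean simplified(String s) {
        return SIMPLIFIED.matcher(s).matches();
    }

    static boolean possessive(String s) {
        return POSSESSIVE.matcher(s).matches();
    }

    public static void main(String[] args) {
        String ok = "abc123";
        // 30 letters followed by a character that neither class accepts:
        // the non-matching case in which the nested form runs away.
        String bad = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!";

        System.out.println(simplified(ok) + " " + possessive(ok));   // true true
        System.out.println(simplified(bad) + " " + possessive(bad)); // false false
    }
}
```

Both variants accept the same inputs in this small case and reject the non-matching input immediately; whether they are interchangeable for a real URL regex depends on the alternatives involved.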

    In my case, the only option was to fix the problem by simplifying the expression.

    More ways to solve this problem:

    https://stackoverflow.com/questions/19990609/regex-gone-wild-java-util-regex-pattern-matcher-goes-into-high-cpu-loop


  • Java's Pattern Class in Detail

    10,000+ views 2018-06-22 10:02:11

    Original article: https://www.cnblogs.com/SQP51312/p/6136304.html


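    Java added support for regular expressions in JDK 1.4.

    Java's regex tools live mainly in the java.util.regex package, which has two main classes: Pattern and Matcher.


    Pattern

    Declaration: public final class Pattern implements java.io.Serializable

    Pattern is declared final, so it cannot be subclassed.

    Meaning: the pattern class, a compiled representation of a regular expression.

    Note: instances of this class are immutable and safe for use by multiple concurrent threads.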


    Fields:

        /**
         * Enables Unix lines mode.
         */
        public static final int UNIX_LINES = 0x01;
    
        /**
         * Enables case-insensitive matching.
         */
        public static final int CASE_INSENSITIVE = 0x02;
    
        /**
         * Permits whitespace and comments in the pattern.
         */
        public static final int COMMENTS = 0x04;
    
        /**
         * Enables multiline mode.
         */
        public static final int MULTILINE = 0x08;
    
        /**
         * Enables literal parsing of the pattern.
         */
        public static final int LITERAL = 0x10;
    
        /**
         * Enables dotall mode.
         */
        public static final int DOTALL = 0x20;
    
        /**
         * Enables Unicode-aware case folding.
         */
        public static final int UNICODE_CASE = 0x40;
    
        /**
         * Enables canonical equivalence.
         */
        public static final int CANON_EQ = 0x80;
        private static final long serialVersionUID = 5073258162644648461L;
    
        /**
         * The original regular-expression pattern string.
         */
        private String pattern;
    
        /**
         * The original pattern flags.
         */
        private int flags;
    
        /**
         * Boolean indicating this Pattern is compiled; this is necessary in order
         * to lazily compile deserialized Patterns.
         */
        private transient volatile boolean compiled = false;
    
        /**
         * The normalized pattern string.
         */
        private transient String normalizedPattern;
    
        /**
         * The starting point of state machine for the find operation.  This allows
         * a match to start anywhere in the input.
         */
        transient Node root;
    
        /**
         * The root of object tree for a match operation.  The pattern is matched
         * at the beginning.  This may include a find that uses BnM or a First
         * node.
         */
        transient Node matchRoot;
    
        /**
         * Temporary storage used by parsing pattern slice.
         */
        transient int[] buffer;
    
        /**
         * Temporary storage used while parsing group references.
         */
        transient GroupHead[] groupNodes;
    
        /**
         * Temporary null terminated code point array used by pattern compiling.
         */
        private transient int[] temp;
    
        /**
         * The number of capturing groups in this Pattern. Used by matchers to
         * allocate storage needed to perform a match.
         */
        transient int capturingGroupCount;
    
        /**
         * The local variable count used by parsing tree. Used by matchers to
         * allocate storage needed to perform a match.
         */
        transient int localCount;
    
        /**
         * Index into the pattern string that keeps track of how much has been
         * parsed.
         */
        private transient int cursor;
    
        /**
         * Holds the length of the pattern string.
         */
        private transient int patternLength;

     


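    Groups and capturing

    Capturing groups are numbered by counting their opening parentheses from left to right.

    In the expression ((A)(B(C))) there are four such groups:

    Group    Matches
    1        ABC
    2        A
    3        BC
    4        C

    Group zero always stands for the entire expression.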
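The numbering can be checked with a short sketch. The class name GroupNumbering and the helper groups are hypothetical names chosen for this example; only Pattern and Matcher calls from java.util.regex are used.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumbering {
    // Returns group 0..groupCount() for a full match, or an empty array if there is no match.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Index 0 is the whole match; indices 1..4 follow the opening parentheses left to right.
        System.out.println(Arrays.toString(groups("((A)(B(C)))", "ABC")));
        // prints [ABC, ABC, A, BC, C]
    }
}
```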


    Constructor

        private Pattern(String p, int f) {
            pattern = p;
            flags = f;
    
            // Reset group index count
            capturingGroupCount = 1;
            localCount = 0;
    
            if (pattern.length() > 0) {
                compile();
            } else {
                root = new Start(lastAccept);
                matchRoot = lastAccept;
            }
        }

    The constructor is private, so a Pattern object cannot be created with new.

    How, then, do we obtain a Pattern instance?

    Looking through the methods, we find:

        public static Pattern compile(String regex) {
            return new Pattern(regex, 0);
        }
        public static Pattern compile(String regex, int flags) {
            return new Pattern(regex, flags);
        }

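    So a Pattern instance is obtained by calling the static compile method on Pattern.


    Some of the other methods:

    1. public Matcher matcher(CharSequence input)

    Creates a matcher that will match the given input against this pattern, and returns that new matcher.
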
        public Matcher matcher(CharSequence input) {
        if (!compiled) {
            synchronized(this) {
            if (!compiled)
                compile();
            }
        }
            Matcher m = new Matcher(this, input);
            return m;
        }

    2. public static boolean matches(String regex, CharSequence input)

    Compiles the given regular expression and attempts to match the given input against it.

        public static boolean matches(String regex, CharSequence input) {
            Pattern p = Pattern.compile(regex);
            Matcher m = p.matcher(input);
            return m.matches();
        }

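    Test:

    Code 1 (based on the JDK API 1.6 example):

            Pattern p = Pattern.compile("a*b");
            Matcher m = p.matcher("aaaaab");
            boolean b = m.matches();
            System.out.println(b);// true

    Code 2:

            System.out.println(Pattern.matches("a*b", "aaaaab"));// true

    Comparing the matcher and matches methods shows that the static matches performs the compile and matcher steps for you; code 2 is a shorthand for code 1, and the two are equivalent.

    If a pattern is to be used many times, compiling it once and reusing it is more efficient than invoking this method each time.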
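A sketch of the compile-once advice: the Pattern is held in a static final field and shared, while a fresh Matcher is created for each input. The class name ReusePattern and the method countMatches are illustrative names, not part of the JDK.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ReusePattern {
    // Compiled once; Pattern instances are immutable and safe to share between threads.
    private static final Pattern A_STAR_B = Pattern.compile("a*b");

    static int countMatches(List<String> inputs) {
        int count = 0;
        for (String s : inputs) {
            // A Matcher holds per-input state, so a new one is created for each input.
            if (A_STAR_B.matcher(s).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("aaaaab", "b", "ab", "ba", "aaa");
        System.out.println(countMatches(inputs)); // 3
    }
}
```

Calling Pattern.matches("a*b", s) inside the loop would give the same count but recompile the expression on every iteration.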

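    3. public String[] split(CharSequence input) and public String[] split(CharSequence input, int limit)

    input: the character sequence to be split;

    limit: the result threshold;

    Splits the input sequence around matches of this pattern.

    What the limit parameter does (written n below):

    The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array.

    If n is greater than zero, the pattern is applied at most n - 1 times, the array's length is no greater than n, and the array's last entry contains all input beyond the last matched delimiter.

    If n is non-positive, the pattern is applied as many times as possible and the array can have any length.

    If n is zero, the pattern is applied as many times as possible, the array can have any length, and trailing empty strings are discarded.

    The source of split(CharSequence input):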

        public String[] split(CharSequence input) {
            return split(input, 0);
        }

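    So split(CharSequence input) simply calls split(CharSequence input, int limit); only the latter is discussed below.

    Suppose:

    input="boo:and:foo" and the delimiter pattern is "o". The pattern can then be applied at most 4 times, and the array can have at most 5 elements.

    1. With limit=-2, the pattern is applied as many times as possible and the array can have any length; prediction: the pattern is applied 4 times, the array length is 5, and the array is {"b","",":and:f","",""};

    2. With limit=2, the pattern is applied at most once and the array length is no greater than 2, with the second element containing all input beyond the last matched delimiter; prediction: the pattern is applied once, the array length is 2, and the array is {"b","o:and:foo"};

    3. With limit=7, the pattern is applied at most 6 times and the array length is no greater than 7; prediction: the pattern is applied 4 times, the array length is 5, and the array is {"b","",":and:f","",""};

    4. With limit=0, the pattern is applied as many times as possible, the array can have any length, and trailing empty strings are discarded; prediction: the pattern is applied 4 times, the array length is 3, and the array is {"b","",":and:f"}.

    Code to verify:
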
    public static void main(String[] args) {
            String[] arr = null;
            CharSequence input = "boo:and:foo";
            Pattern p = Pattern.compile("o");
            arr = p.split(input, -2);
            System.out.println(printArr(arr));// {"b","",":and:f","",""}, 5 elements
            arr = p.split(input, 2);
            System.out.println(printArr(arr));// {"b","o:and:foo"}, 2 elements
            arr = p.split(input, 7);
            System.out.println(printArr(arr));// {"b","",":and:f","",""}, 5 elements
            arr = p.split(input, 0);
            System.out.println(printArr(arr));// {"b","",":and:f"}, 3 elements
        }
    
        // Format a String array as {"x","y"} followed by its element count
        public static String printArr(String[] arr) {
            int length = arr.length;
            StringBuilder sb = new StringBuilder();
            sb.append("{");
            for (int i = 0; i < length; i++) {
                sb.append("\"").append(arr[i]).append("\"");
                if (i != length - 1)
                    sb.append(",");
            }
            sb.append("}").append(", " + length + " elements");
            return sb.toString();
        }

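    The output agrees with the predictions above.

    4. toString() and pattern()

    The two methods have identical code; both return the string representation of this pattern.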

       public String toString() {
            return pattern;
        }
        public String pattern() {
            return pattern;
        }

    Test:

    Pattern p = Pattern.compile("\\d+");
    System.out.println(p.toString());// prints \d+
    System.out.println(p.pattern());// prints \d+

    5. public int flags()

    Returns this pattern's match flags.

        public int flags() {
            return flags;
        }

    Test:

    Pattern p = Pattern.compile("a+", Pattern.CASE_INSENSITIVE);
    System.out.println(p.flags());// 2

    From the Pattern source code:

    public static final int CASE_INSENSITIVE = 0x02;

    CASE_INSENSITIVE is 2, so the test prints 2.
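Because each flag is a distinct bit, several flags can be combined with bitwise OR and tested with bitwise AND. The sketch below assumes nothing beyond the constants listed earlier; the class name FlagsDemo and the helper has are made up for the example.

```java
import java.util.regex.Pattern;

public class FlagsDemo {
    // True if the given flag bit is set on the pattern.
    static boolean has(Pattern p, int flag) {
        return (p.flags() & flag) != 0;
    }

    public static void main(String[] args) {
        // 0x02 | 0x08 = 0x0A
        Pattern p = Pattern.compile("^a+$", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

        System.out.println(p.flags());                        // 10
        System.out.println(has(p, Pattern.CASE_INSENSITIVE)); // true
        System.out.println(has(p, Pattern.DOTALL));           // false

        // With MULTILINE, ^ and $ also match at line boundaries, so the second line matches.
        System.out.println(p.matcher("b\nAAA").find());       // true
    }
}
```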


  • Using Pattern and Matcher

    10,000+ views, many likes 2018-09-16 10:54:48
    Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission. https://blog.csdn.net/woaigaolaoshi/article/details/50970527
                                            <div class="markdown_views">
    Java regular expressions are implemented by the Pattern and Matcher classes in the java.util.regex package.
    

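    The Pattern class is used to create a regular expression, or put another way, a matching pattern. It is created through two static methods: compile(String regex) and compile(String regex, int flags), where regex is the regular expression and flags selects optional modes (for example, Pattern.CASE_INSENSITIVE ignores case).

    Example:

    Pattern pattern = Pattern.compile("Java");
    System.out.println(pattern.pattern());// returns this pattern's regular expression, i.e. Java

    The Pattern class also has two methods that split an input sequence around matches of the pattern: split(CharSequence input) and split(CharSequence input, int limit), where a positive limit caps the number of elements returned.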

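    Example:

    Pattern pattern = Pattern.compile("Java");
    String test="123Java456Java789Java";
    String[] result = pattern.split(test);
    for(String s : result)
        System.out.println(s);

    Result:

    123
    456
    789

    A closer look at split(CharSequence input, int limit): when limit is greater than the maximum number of strings that can be returned, or is negative, the number of returned strings is not limited, but the result may end with empty strings. limit=0 is equivalent to split(CharSequence input), and trailing empty strings are discarded.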

    Pattern pattern = Pattern.compile("Java");
    String test = "123Java456Java789Java";
    
    String[] result = pattern.split(test,2);
    for(String s : result)
                System.out.println(s);
    
    result = pattern.split(test,10);
    System.out.println(result.length);
    
    result = pattern.split(test,-2);
    System.out.println(result.length);
    
    result = pattern.split(test,0);
    System.out.println(result.length);
      
    Output:

    123
    456Java789Java
    4
    4
    3

    The Pattern class also comes with a static matching method, matches(String regex, CharSequence input), but it can only match against the entire string and only returns a boolean indicating whether the match succeeded.

    Example:

    String test1 = "Java";
    String test2 = "Java123456";

    System.out.println(Pattern.matches("Java",test1));// returns true
    System.out.println(Pattern.matches("Java",test2));// returns false

    Finally we move on to the Matcher class. The matcher(CharSequence input) method of Pattern returns a Matcher object.

    The Matcher class supports regex groups as well as repeated matching; for richer matching operations, use Pattern and Matcher together.

    Example:

    Pattern pattern = Pattern.compile("Java");
    String test = "123Java456Java789Java";
    Matcher matcher = pattern.matcher(test);
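As a small sketch of the repeated matching mentioned above, the loop below calls find() until it returns false and records where each match starts. The class name FindAll and the helper startsOf are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindAll {
    // Collects the start index of every match of regex in input.
    static List<Integer> startsOf(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        // Each call to find() resumes after the previous match.
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(startsOf("Java", "123Java456Java789Java")); // [3, 10, 17]
    }
}
```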

    The Matcher class provides three kinds of matching methods that return a boolean: matches(), lookingAt(), and find() with its overload find(int start). matches() matches against the entire string, lookingAt() matches a prefix starting at the very beginning of the string, and find() matches a substring anywhere in the string, with start giving the index at which the search begins.

    Example:

    Pattern pattern = Pattern.compile("Java");
    String test1 = "Java";
    String test2 = "Java1234";
    String test3 = "1234Java";
    Matcher matcher = pattern.matcher(test1);
    System.out.println(matcher.matches());// returns true
    matcher = pattern.matcher(test2);
    System.out.println(matcher.matches());// returns false
    
    matcher = pattern.matcher(test2);
    System.out.println(matcher.lookingAt());// returns true
    matcher = pattern.matcher(test3);
    System.out.println(matcher.lookingAt());// returns false
    
    matcher = pattern.matcher(test1);
    System.out.println(matcher.find());// returns true
    matcher = pattern.matcher(test2);
    System.out.println(matcher.find());// returns true
    matcher = pattern.matcher(test3);
    System.out.println(matcher.find(2));// returns true
    matcher = pattern.matcher(test3);
    System.out.println(matcher.find(5));// returns false

    Now for the concept of groups: a group is a part of the regular expression delimited by parentheses, and it can be referenced by its number. Group 0 is the whole expression, group 1 is the group enclosed by the first pair of parentheses, and so on. For example, in A(B(C))D, group 0 is ABCD, group 1 is BC, and group 2 is C.

    The Matcher class provides start(), end(), and group(), which return the start index of the match, the end index of the match (the offset just past its last character), and the matched string, respectively.

    Example:

    Pattern pattern = Pattern.compile("Java");
    String test = "123Java456";
    
    Matcher matcher = pattern.matcher(test);
    matcher.find();
    System.out.println(matcher.start());// returns 3
    System.out.println(matcher.end());// returns 7
    System.out.println(matcher.group());// returns Java

    The Matcher class provides start(int group), end(int group), group(int i), and groupCount() for working with groups.

    Example:

    Pattern pattern = Pattern.compile("(Java)(Python)");
    String test = "123JavaPython456";
    Matcher matcher = pattern.matcher(test);
    matcher.find();
    System.out.println(matcher.groupCount());// returns 2
    
    System.out.println(matcher.group(1));// returns the string matched by group 1, "Java"; note that group numbering starts at 1
    System.out.println(matcher.start(1));// returns 3, the start index of group 1
    System.out.println(matcher.end(1));// returns 7, the end index of group 1
    
    System.out.println(matcher.group(2));// returns the string matched by group 2, "Python"
    System.out.println(matcher.start(2));// returns 7, the start index of group 2
    System.out.println(matcher.end(2));// returns 13, the end index of group 2

    The Matcher class also provides region(int start, int end) (end is exclusive) to set the search range, along with regionStart() and regionEnd(), which return the start and end indices of that range.

    Pattern pattern = Pattern.compile("Java");
    String test = "123JavaJava";
    Matcher matcher = pattern.matcher(test);
    matcher.region(7, 11);
    System.out.println(matcher.regionStart());// returns 7
    System.out.println(matcher.regionEnd());// returns 11
    matcher.find();
    System.out.println(matcher.group());// returns Java

    The Matcher class provides two methods for resetting the matcher: reset() and reset(CharSequence input).

    Pattern pattern = Pattern.compile("Java");
    String test = "Java";
    Matcher matcher = pattern.matcher(test);

    matcher.find();
    System.out.println(matcher.group());// returns Java

    matcher.reset();// start matching again from the beginning

    matcher.find();
    System.out.println(matcher.group());// returns Java

    matcher.reset("Python");
    System.out.println(matcher.find());// returns false

    Finally, the Matcher class's replacement methods: replaceAll(String replacement) and replaceFirst(String replacement). replaceAll replaces every matched substring, while replaceFirst replaces only the first one.

    Pattern pattern = Pattern.compile("Java");
    String test = "JavaJava";
    Matcher matcher = pattern.matcher(test);
    System.out.println(matcher.replaceAll("Python"));// returns PythonPython
    System.out.println(matcher.replaceFirst("Python"));// returns PythonJava

    Two more methods, appendReplacement(StringBuffer sb, String replacement) and appendTail(StringBuffer sb), are also important. appendReplacement appends the input up to the current match, followed by the replacement, to a StringBuffer; it works incrementally, one match at a time, rather than replacing only once or everything at once. appendTail then appends the remaining unmatched input to the StringBuffer.

        Pattern pattern = Pattern.compile("Java");
        Matcher matcher = pattern.matcher("Java1234");

        System.out.println(matcher.find());// returns true
        StringBuffer sb = new StringBuffer();

        matcher.appendReplacement(sb, "Python");
        System.out.println(sb);// prints Python

        matcher.appendTail(sb);
        System.out.println(sb);// prints Python1234
    </div>
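The incremental nature of appendReplacement is easier to see when it runs in a find() loop and the replacement depends on each match. A minimal sketch, with the class name AppendLoop and the method doubleNumbers invented for the example:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AppendLoop {
    // Replaces every run of digits with twice its numeric value.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int doubled = Integer.parseInt(m.group()) * 2;
            // Appends the text between the previous match and this one, then the replacement.
            m.appendReplacement(sb, Integer.toString(doubled));
        }
        // Appends whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("Java8 and Java11!")); // Java16 and Java22!
    }
}
```

The replacement here is digits only; a replacement that may contain $ or \ would need Matcher.quoteReplacement to be taken literally.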
    
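  • Java Regex (1): Pattern in Detail (Part 1)

    1,000+ views 2018-04-21 15:48:37
    Pattern in detail; Matcher in detail; regular expression syntax in detail. We start with the Pattern class. In Java, the java.util.regex package defines the classes used for regular expressions; the two most important are Pattern and Matcher: Pattern compiles...
  • Pattern Recognition and Machine Learning (complete solutions).pdf
  • Regular Expressions: Pattern and Matcher

    1,000+ views 2018-01-11 13:32:52
    Pattern and Matcher 1.1 Background: before using regular expressions we need to understand Pattern and Matcher. Why? They solve problems that String alone cannot, and they are practical and powerful regular-expression objects. 1.2 Library: both belong to the same library package...
  • [Learning Pattern] Overview

    1,000+ views 2017-04-27 10:13:59
     Pattern is a web mining module for the Python programming language. It has tools for data mining (Google, Twitter and Wikipedia APIs, a web crawler, an HTML DOM parser), natural language processing (part-of-speech tagging, n-gram search, sentiment analysis, WordNet), machine learning (vector space model,...
  • Java Pattern and Matcher String Matching in Detail

    10,000+ views, many likes 2017-09-02 21:40:33
    Definition of the Pattern class: public final class Pattern extends Object implements Serializable. A compiled representation of a regular expression, used to create a matching pattern once the regular expression has been compiled. A regular expression, specified as a string, must first be compiled into an instance of this class...
  • The Mathematical Meaning of Pattern

    1,000+ views 2020-11-18 22:28:42
    *A pattern constitutes a set of numbers or objects, in which all the members are related with each other by a specific rule. Pattern is also known as sequence.* (easycalculation.com) Pattern has a...
  • Understanding (?=pattern), (?!pattern), and (?:pattern): when learning Java regular expressions, constructs such as (?=pattern)(?!pattern)(?:pattern) look hard to understand, and the official explanations only get more confusing the more you read. After consulting references, here is a summary. Lookaround: (?=pattern) and (?!...
  • Using @Pattern

    1,000+ views 2019-07-19 15:36:58
    Put this annotation on a field of the entity; you can specify a group or use the default;...@Pattern(regexp = "\\w+$") private String userName; Then add the @Validated annotation on the user parameter or in the controller layer to trigger it. For example: public void add...
  • A Brief Introduction to Servlet url-pattern Matching Rules

    1,000+ views, many likes 2018-10-06 20:19:14
    url-pattern>/</url-pattern>, which prompted some thought about url-pattern matching rules. Chapter 1: Matching overview. <url-pattern> is a tag we often configure when building web projects with Servlets, for example: <...
  • url-pattern Configuration in web.xml Explained

    1,000+ views 2018-07-05 10:32:09
    Contents: Preface, Symptoms, Source Analysis, Hands-on Examples, Summary, References. Preface: today I looked into the url-pattern setting in Tomcat's web.xml configuration file. This question had actually bothered me since before I graduated, when I was busy job hunting. After finding a job I stayed busy and had no time to think about it. ...
  • Pattern+Classification+(2nd+Edition).pdf

    Popular discussion 2011-10-02 15:08:33
    Pattern classification is based on pattern recognition, which is a key step in the pattern recognition
  • The classic book on machine learning and pattern recognition by Bishop, who is the leader of the AI team at Microsoft Research Cambridge
  • Regular Expressions: Pattern

    1,000+ views 2019-04-04 16:23:46
    Pattern and regular expressions: Pattern is a class in java.util.regex (a library package for matching strings against patterns defined by regular expressions). A Pattern is the compiled form of a regular expression. I. Pattern's methods are as follows: 1.static ...
  • logback pattern Explained

    10,000+ views 2018-05-08 18:06:46
    The reason for singling out logback's pattern here is that today I used MyBatis's plugin mechanism to save all UPDATE SQL statements to a file to serve as the release script! Below is the relevant part of my logback.xml: <...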
