  • C - Word frequency counter

    2021-03-31 17:12:03

    /*
     * Write a program that prints the distinct words in its input sorted into 
     * decreasing order of frequency of occurrence. Precede each word by its count.
     *
     * WordFrequencyCounter.c - by FreeMan
     */
    
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <assert.h>
    
    typedef struct WORD
    {
        char *Word;
        size_t Count;
        struct WORD *Left;
        struct WORD *Right;
    } WORD;
    
    /*
      Assumptions: input is on stdin, output to stdout.
    
      Plan: read the words into a tree, keeping a count of how many we have,
            allocate an array big enough to hold Treecount (WORD *)'s
            walk the tree to populate the array.
            qsort the array, based on size.
            printf the array
            free the array
            free the tree
            free tibet (optional)
            free international shipping!
    */
    
    #define SUCCESS                      0
    #define CANNOT_MALLOC_WORDARRAY      1
    #define NO_WORDS_ON_INPUT            2
    #define NO_MEMORY_FOR_WORDNODE       3
    #define NO_MEMORY_FOR_WORD           4
    #define NONALPHA "1234567890 \v\f\n\t\r+=-*/\\,.;:'#~?<>|{}[]`!\"£$%^&()"
    
    int ReadInputToTree(WORD **DestTree, size_t *Treecount, FILE *Input);
    int AddToTree(WORD **DestTree, size_t *Treecount, char *Word);
    int WalkTree(WORD **DestArray, WORD *Word);
    int CompareCounts(const void *vWord1, const void *vWord2);
    int OutputWords(FILE *Dest, size_t Count, WORD **WordArray);
    void FreeTree(WORD *W);
    char *dupstr(char *s);
    
    int main(void)
    {
        int Status = SUCCESS;
        WORD *Words = NULL;
        size_t Treecount = 0;
        WORD **WordArray = NULL;
    
        /* Read the words on stdin into a tree */
        if (SUCCESS == Status)
        {
            Status = ReadInputToTree(&Words, &Treecount, stdin);
        }
    
        /* Sanity check for no sensible input */
        if (SUCCESS == Status)
        {
            if (0 == Treecount)
            {
                Status = NO_WORDS_ON_INPUT;
            }
        }
    
        /* Allocate a sufficiently large array */
        if (SUCCESS == Status)
        {
            WordArray = malloc(Treecount * sizeof * WordArray);
            if (NULL == WordArray)
            {
                Status = CANNOT_MALLOC_WORDARRAY;
            }
        }
    
        /* Walk the tree into the array */
        if (SUCCESS == Status)
        {
            Status = WalkTree(WordArray, Words);
        }
    
        /* Quick sort the array */
        if (SUCCESS == Status)
        {
            qsort(WordArray, Treecount, sizeof * WordArray, CompareCounts);
        }
    
        /* Walk down the WordArray outputting the values */
        if (SUCCESS == Status)
        {
            Status = OutputWords(stdout, Treecount, WordArray);
        }
    
        /* Free the word array */
        if (NULL != WordArray)
        {
            free(WordArray);
            WordArray = NULL;
        }
    
        /* Free the tree memory */
        if (NULL != Words)
        {
            FreeTree(Words);
            Words = NULL;
        }
    
        /* Error report and we are finished */
        if (SUCCESS != Status)
        {
            fprintf(stderr, "Program failed with code %d\n", Status);
        }
        return (SUCCESS == Status ? EXIT_SUCCESS : EXIT_FAILURE);
    }
    
    void FreeTree(WORD *W)
    {
        if (NULL != W)
        {
            if (NULL != W->Word)
            {
                free(W->Word);
                W->Word = NULL;
            }
            if (NULL != W->Left)
            {
                FreeTree(W->Left);
                W->Left = NULL;
            }
            if (NULL != W->Right)
            {
                FreeTree(W->Right);
                W->Right = NULL;
            }
            /* Finally release the node itself */
            free(W);
        }
    }
    
    int AddToTree(WORD **DestTree, size_t *Treecount, char *Word)
    {
        int Status = SUCCESS;
        int CompResult = 0;
    
        /* Safety check */
        assert(NULL != DestTree);
        assert(NULL != Treecount);
        assert(NULL != Word);
    
        /* Ok, either *DestTree is NULL or it isn't (deep huh?) */
        if (NULL == *DestTree)  /* This is the place to add it then */
        {
            *DestTree = malloc(sizeof **DestTree);
            if (NULL == *DestTree)
            {
                /* Horrible - we're out of memory */
                Status = NO_MEMORY_FOR_WORDNODE;
            }
            else
            {
                (*DestTree)->Left = NULL;
                (*DestTree)->Right = NULL;
                (*DestTree)->Count = 1;
                (*DestTree)->Word = dupstr(Word);
                if (NULL == (*DestTree)->Word)
                {
                    /* Even more horrible - we've run out of memory in the middle */
                    Status = NO_MEMORY_FOR_WORD;
                    free(*DestTree);
                    *DestTree = NULL;
                }
                else
                {
                    /* Everything was successful, add one to the tree nodes count */
                    ++ *Treecount;
                }
            }
        }
        else /* We need to make a decision */
        {
            CompResult = strcmp(Word, (*DestTree)->Word);
            if (0 < CompResult)
            {
                Status = AddToTree(&(*DestTree)->Right, Treecount, Word);
            }
            else if (0 > CompResult)
            {
                Status = AddToTree(&(*DestTree)->Left, Treecount, Word);
            }
            else
            {
                /* Add one to the count - this is the same node */
                ++(*DestTree)->Count;
            }
        }
    
        return Status;
    }
    
    int ReadInputToTree(WORD **DestTree, size_t *Treecount, FILE *Input)
    {
        int Status = SUCCESS;
        char Buf[8192] = { 0 };
        char *Word = NULL;
    
        /* Safety check */
        assert(NULL != DestTree);
        assert(NULL != Treecount);
        assert(NULL != Input);
    
        /* For every line */
        while (NULL != fgets(Buf, sizeof Buf, Input))
        {
            /* Strtok the input to get only alpha character words */
            Word = strtok(Buf, NONALPHA);
            while (SUCCESS == Status && NULL != Word)
            {
                /* Deal with this word by adding it to the tree */
                Status = AddToTree(DestTree, Treecount, Word);
    
                /* Next word */
                if (SUCCESS == Status)
                {
                    Word = strtok(NULL, NONALPHA);
                }
            }
        }
    
        return Status;
    }
    
    int WalkTree(WORD **DestArray, WORD *Word)
    {
        int Status = SUCCESS;
        static WORD **Write = NULL;
    
        /* Safety check */
        assert(NULL != Word);
    
        /* Store the starting point if this is the first call */
        if (NULL != DestArray)
        {
            Write = DestArray;
        }
    
        /* Now add this node and its kids */
        if (NULL != Word)
        {
            *Write = Word;
            ++Write;
            if (NULL != Word->Left)
            {
                Status = WalkTree(NULL, Word->Left);
            }
            if (NULL != Word->Right)
            {
                Status = WalkTree(NULL, Word->Right);
            }
        }
    
        return Status;
    }
    
    /*
     * CompareCounts is called by qsort. This means that it gets pointers to the
     * data items being compared. In this case the data items are pointers too.
     */
    int CompareCounts(const void *vWord1, const void *vWord2)
    {
        int Result = 0;
        WORD *const *Word1 = vWord1;
        WORD *const *Word2 = vWord2;
    
        assert(NULL != vWord1);
        assert(NULL != vWord2);
    
        /* Ensure the result is either 1, 0 or -1 */
        if ((*Word1)->Count < (*Word2)->Count)
        {
            Result = 1;
        }
        else if ((*Word1)->Count > (*Word2)->Count)
        {
            Result = -1;
        }
        else
        {
            Result = 0;
        }
    
        return Result;
    }
    
    int OutputWords(FILE *Dest, size_t Count, WORD **WordArray)
    {
        int Status = SUCCESS;
        size_t Pos = 0;
    
        /* Safety check */
        assert(NULL != Dest);
        assert(NULL != WordArray);
    
        /* Print a header */
        fprintf(Dest, "Total Words : %lu\n", (unsigned long)Count);
    
        /* Print the words in descending order */
        while (SUCCESS == Status && Pos < Count)
        {
            fprintf(Dest, "%10lu %s\n", (unsigned long)WordArray[Pos]->Count, WordArray[Pos]->Word);
            ++Pos;
        }
    
        return Status;
    }
    
    
    /*
     * dupstr: Duplicate a string
     */
    char *dupstr(char *s)
    {
        char *Result = NULL;
        size_t slen = 0;
    
        /* Sanity check */
        assert(NULL != s);
    
        /* Get string length */
        slen = strlen(s);
    
        /* Allocate enough storage */
        Result = malloc(slen + 1);
    
        /* Populate string */
        if (NULL != Result)
        {
            memcpy(Result, s, slen);
            *(Result + slen) = '\0';
        }
    
        return Result;
    }
    
    // Output:
    /*
    These are short, famous texts in English from classic sources like the Bible or Shakespeare. Some texts have word definitions and explanations to help you.
    Some of these texts are written in an old style of English. Try to understand them, because the English that we speak today is based on what our great, great,
    great, great grandparents spoke before! Of course, not all these texts were originally written in English. The Bible, for example, is a translation. But they
    are all well known in English today, and many of them express beautiful thoughts.
    ^Z
    Total Words : 66
             5 English
             4 great
             4 texts
             4 in
             3 are
             3 of
             2 is
             2 today
             2 them
             2 written
             2 all
             2 the
             2 Bible
             2 these
             2 to
             2 Some
             2 and
             1 word
             1 definitions
             1 have
             1 explanations
             1 Shakespeare
             1 help
             1 you
             1 or
             1 like
             1 sources
             1 an
             1 old
             1 style
             1 Try
             1 understand
             1 classic
             1 because
             1 that
             1 we
             1 speak
             1 from
             1 famous
             1 based
             1 on
             1 what
             1 our
             1 short
             1 grandparents
             1 spoke
             1 before
             1 Of
             1 course
             1 not
             1 These
             1 were
             1 originally
             1 The
             1 for
             1 example
             1 a
             1 translation
             1 But
             1 they
             1 well
             1 known
             1 many
             1 express
             1 beautiful
             1 thoughts
    
    */
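For comparison, the plan in the C program's comment (read the words, count them, sort by count, print) can be sketched in a few lines of Python. This is only a rough equivalent under stated assumptions, not a port: `count_words` and `format_report` are hypothetical names, words are taken to be runs of ASCII letters (the C program instead splits on its NONALPHA delimiter list, so input containing characters such as `@` or `_` would be tokenized differently), and ties are broken alphabetically, which the C program's qsort does not guarantee.

```python
import re
from collections import Counter


def count_words(text):
    """Count runs of ASCII letters, case-sensitively (an assumption that
    only approximates the C program's NONALPHA delimiter list)."""
    return Counter(re.findall(r"[A-Za-z]+", text))


def format_report(counts):
    """Return the report as one string: a header line, then one line per
    distinct word in decreasing order of count."""
    lines = ["Total Words : %d" % len(counts)]
    # Sort by descending count; break ties alphabetically so the
    # output is deterministic.
    for word, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append("%10d %s" % (n, word))
    return "\n".join(lines)
```

To mirror the C program's use of stdin, one could call `print(format_report(count_words(sys.stdin.read())))` after importing `sys`.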

     

  • Python solution for calculating word frequency (https://github.com/tmbdev/hocr-tools/wiki/Calculate-word-frequency). This question comes from the open-source project ocropus/hocr-tools.
  • #Step 3: build a dictionary with words and frequency info and get it sorted def create_word_dictionary (clean_word_list) : word_count = {} for word in clean_word_list: if word in word_...

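    Following Bucky Roberts's tutorial, I wrote a simple script that counts word frequency on a web page.

    Goal: scrape the words (English words here) from a given web page and sort them by how often they appear.

    Steps:
    1. Create a list and put every string on the page into it.
    2. Clean the list by stripping special symbols.
    3. Create a dictionary from the list and sort its entries by word frequency.

    Required modules: requests, BeautifulSoup, operator

    The code, with comments: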

    import requests
    from bs4 import BeautifulSoup
    import operator
    
    
    url = 'https://www.python.org/events/'
    
    #Step 1: create a list with every word in
    def start(url):
    
        #set up a blank list to store words
        word_list = []
        #get source code from url, pick the content word by word and put in the list
        #internet request, connect the url
        source_code = requests.get(url).text
        #turn into soup object to work with
        soup = BeautifulSoup(source_code,"html.parser")
    
        #inspect the unique element for the content you need
        for post_text in soup.find_all('span',{'class':'event-location'}):
            #lower the content and split the sentences
            #get_text() always returns a str, whereas .string can be None
            content = post_text.get_text()
            words = content.lower().split()
            for each_word in words:
                word_list.append(each_word)
        clean_up_list(word_list)
    
    #Step 2: clean up the list, take out things which are not words
    def clean_up_list(word_list):
        clean_word_list = []
        for word in word_list:
            #the backslash is escaped so Python does not read \| as an escape sequence
            symbols = '`~!@#$%^&*()-=_+[]\\|;/\':""?/,.<>{}'
            for symbol in symbols:
                word = word.replace(symbol,'')
            if len(word) > 0:
                clean_word_list.append(word)
        create_word_dictionary(clean_word_list)
    
    
    #Step 3: build a dictionary with words and frequency info and get it sorted
    def create_word_dictionary(clean_word_list):
        word_count = {}
        for word in clean_word_list:
            if word in word_count:
                word_count[word] += 1
            else:
                word_count[word] = 1
        for key,value in sorted(word_count.items(), key = operator.itemgetter(1)):
            print (key, value)
    
    start(url)
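Steps 2 and 3 can also be tried without any network access. The sketch below is a minimal variant under assumptions, not the tutorial's code: `clean_words` and `count_sorted` are hypothetical names, `string.punctuation` stands in for the hand-written symbol list (the two sets overlap but are not identical), and `collections.Counter` replaces the manual dictionary update.

```python
import string
from collections import Counter


def clean_words(raw_words):
    """Strip ASCII punctuation from each word and drop words left empty."""
    # A translation table that deletes every punctuation character.
    table = str.maketrans("", "", string.punctuation)
    cleaned = (word.translate(table) for word in raw_words)
    return [word for word in cleaned if word]


def count_sorted(words):
    """Return (word, count) pairs sorted by ascending count."""
    return sorted(Counter(words).items(), key=lambda item: item[1])


pairs = count_sorted(clean_words(["ohio,", "usa", "--", "(usa)"]))
for word, count in pairs:
    print(word, count)
```

Because the helpers take and return plain lists, they can be checked on a hand-written list before being wired to the scraped page.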

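    The Python events page serves as the example here (counting word frequency on this page is not particularly useful; it only demonstrates the technique).

    The output is shown below (quoted as text, since I still cannot post images):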

    basel 1
    centre 2
    and 2
    city 2
    hotel 2
    campus 2
    republic 2
    switzerland 2
    new 2
    ossa 2
    convention 2
    ohio 2
    1 3
    singapore 3
    south 3
    usa 3
    computing 3
    germany 5
    the 5
    university 6
    of 9
    Process finished with exit code 0

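    The results are sorted by word frequency in ascending order.
    key = operator.itemgetter(1) sorts by frequency, the second element of each (word, count) pair; with index 0 instead, the items are sorted alphabetically by word.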

    for key,value in sorted(word_count.items(), key = operator.itemgetter(1)):
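As a small illustration of the sort key, the sketch below sorts a hand-made dictionary three ways. The `word_count` values are sample data chosen for this example, and `reverse=True` is the standard `sorted` argument for descending order; it is not used in the original script.

```python
import operator

# Sample data for illustration only.
word_count = {"of": 9, "university": 6, "the": 5, "basel": 1}

# itemgetter(1) picks the count from each (word, count) pair.
by_count_asc = sorted(word_count.items(), key=operator.itemgetter(1))

# reverse=True flips the order, so the most frequent word comes first.
by_count_desc = sorted(word_count.items(), key=operator.itemgetter(1), reverse=True)

# itemgetter(0) picks the word, giving alphabetical order.
by_word = sorted(word_count.items(), key=operator.itemgetter(0))

print(by_count_desc[0])
```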

    That is the whole implementation. If anything is unclear, please point it out; discussion is welcome.

    Reference tutorials:
    http://www.bilibili.com/video/av2847788/index_35.html
    http://www.bilibili.com/video/av2847788/index_36.html
    http://www.bilibili.com/video/av2847788/index_37.html
    Page used:
    https://www.python.org/events/

  • /*SET THE CHANNEL AND FREQUENCY OF LORA TRANSCEIVER*/ #define BAND 915E6 //915MHz, E6 channel /*DEFINE THE PINS USED BY THE OLED SCREEN*/ #define OLED_SDA 4 #define OLED_SCL 15 #define OLED_RST 16 #...
  • Three methods: ① use a plain dict ② use defaultdict ... if word not in frequency: frequency[word] = 1 else: frequency[word] += 1  ② defaultdict import collections frequency = collections.defaultdict(int)
  • Three ways to count word frequency in Python

    1,000+ reads 2018-01-23 12:00:08

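    Three methods:

    ① Use a plain dict

    ② Use defaultdict

    ③ Use Counter

    Note: `int()` called with no arguments returns 0, which is why `defaultdict(int)` starts every missing key at 0.

    ① dict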

    text = "I'm a hand some boy!"
    
    frequency = {}
    
    for word in text.split():
        if word not in frequency:
            frequency[word] = 1
        else:
            frequency[word] += 1
    




    ②defaultdict

    import collections
    
    frequency = collections.defaultdict(int)
    
    text = "I'm a hand some boy!"
    
    for word in text.split():
        frequency[word] += 1
    


    ③Counter

    import collections
    
    text = "I'm a hand some boy!"
    frequency = collections.Counter(text.split())
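A quick way to see that the three methods agree is to run them on the same input and compare the results. The sketch below does that and also shows `Counter.most_common`, a standard method the post does not use; the second sample string is made up for illustration.

```python
import collections

text = "I'm a hand some boy!"
words = text.split()

# ① plain dict, using get() with a default instead of an if/else
plain = {}
for word in words:
    plain[word] = plain.get(word, 0) + 1

# ② defaultdict(int): a missing key starts at int(), which is 0
by_default = collections.defaultdict(int)
for word in words:
    by_default[word] += 1

# ③ Counter does the counting in one call
counter = collections.Counter(words)

same = plain == dict(by_default) == dict(counter)

# most_common(n) returns the n highest counts, largest first
top_two = collections.Counter("a b a c b a".split()).most_common(2)
print(same, top_two)
```

All three produce the same mapping; `Counter` is the shortest and also offers `most_common` for ranked output.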
  • Python implementation: class Solution: def topKFrequent(self, words: List[str], k: int) -> List[str]: frequency = collections.Counter(words... queue = [(-freq, word)for word,freq in frequency.items()] heapq.

    Python implementation:

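    import collections
    import heapq
    from typing import List

    class Solution:
        def topKFrequent(self, words: List[str], k: int) -> List[str]:
            frequency = collections.Counter(words)
            # heapq is a min-heap, so the counts are negated to pop the most
            # frequent words first; equal counts fall back to alphabetical order.
            queue = [(-freq, word) for word, freq in frequency.items()]
            heapq.heapify(queue)
            return [heapq.heappop(queue)[1] for _ in range(k)]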
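The same ordering (highest count first, alphabetical among equal counts) can also be expressed with `heapq.nsmallest` and a composite key. This is an alternative sketch, not the post's method; `top_k_frequent` is a hypothetical standalone function and the sample list is made up for illustration.

```python
import collections
import heapq


def top_k_frequent(words, k):
    """Return the k most frequent words, ties broken alphabetically."""
    frequency = collections.Counter(words)
    # The smallest (-count, word) keys are the largest counts,
    # then the alphabetically earliest words.
    best = heapq.nsmallest(
        k, frequency.items(), key=lambda item: (-item[1], item[0])
    )
    return [word for word, _ in best]


sample = ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"]
result = top_k_frequent(sample, 4)
print(result)
```

Unlike popping from the heap k times, `nsmallest` simply returns fewer items when k exceeds the number of distinct words.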
    

    Java implementation:

    import java.util.*;

    class Solution {
        public List<String> topKFrequent(String[] words, int k) {
            Map<String, Integer> count = new HashMap<>();
            for (String word : words) {
                count.put(word, count.getOrDefault(word, 0) + 1);
            }
            // Min-heap: the least frequent word sits on top; among equal counts,
            // the alphabetically later word sits on top so it is evicted first.
            PriorityQueue<String> heap = new PriorityQueue<String>(
                (w1, w2) -> count.get(w1).equals(count.get(w2))
                    ? w2.compareTo(w1)
                    : count.get(w1) - count.get(w2));

            for (String word : count.keySet()) {
                heap.offer(word);
                if (heap.size() > k) heap.poll();
            }

            List<String> ans = new ArrayList<>();
            while (!heap.isEmpty()) ans.add(heap.poll());
            Collections.reverse(ans);
            return ans;
        }
    }
    
  • def frequency(list_word): c = Counter() for x in list_word: if len(x) > 1 and x != '\r\n': c[x] += 1 word = [] key = [] for (k, v) in c.most_common(100): print('%s %d' % (k, v)) word...
  • s important to build a vocab with indices generated based on word frequencies (including \<unk>, \<eos>, etc): - assign indices to special & normal tokens based their frequency. - If ...
  • Once we get to the second if statement, we only have a string (test) comprised of letters that all have a frequency of 1, limiting the words we can utilize. <p>Was this an intentional decision or am...
  • Instruction’s length is one word. Green mode: Periodical wakeup by timer Most of instructions are one cycle only. All ROM area JMP/CALL instruction.  Package (Chip form support) All ROM area...
  • font issue with pygame

    2020-12-29 21:43:49
    ....[some stuff here that creates a list of word frequency tuples from my csv data] word_frequencies = [] for n,irecord in enumerate(sorted(word_freqs.items(), key=lambda item: item[1]))...
  • - parallel-letter-frequency - hangman - variable-length-quantity - alphametics - custom-set - diffie-hellman - change - protein-translation - counter - lens-person - go-counting - word-search - ...
  • parallel-letter-frequency hamming sum-of-multiples pythagorean-triplet bank-account crypto-square luhn largest-series-product sieve palindrome-products bracket-push anagram word-count ...
  • degital electronics

    2010-02-03 21:30:31
    10.9.6 Maximum Clock Frequency 402 10.10 Flip-Flop Applications 402 10.10.1 Switch Debouncing 402 10.10.2 Flip-Flop Synchronization 404 10.10.3 Detecting the Sequence of Edges 404 10.11 Application-...
  • Calculate word frequency in a block of text 1-Beginner Weather App Get the temperature, weather condition of a city. 1-Beginner Tier-2: Intermediate Projects Name Short Description Tier ...
  • num-checkpoint-not-improved 32 --batch-size 4000 --batch-type word --rnn-attention-type mlp --rnn-dropout-inputs 0.1 --rnn-decoder-hidden-dropout 0.2 --use-tensorboard --checkpoint-frequency 4000 --...
  • Hello, I am trying to do a SDOwrite during pre-op to 0x1c13:01 with a value of 0x1a01 but I get error:08000021 Data cannot be transferred or stored to the application because of local ...
  • Ultra-simplified explanation to design patterns! A topic that can easily make anyone's mind wobble. Here I try to make them stick in to your mind (and maybe mine) by explaining them in the simplest...
  • ft_max_word_len 84 ft_min_word_len 4 ft_query_expansion_limit 20 ft_stopword_file (built-in) general_log OFF general_log_file db149.log group_concat_max_len 1024 have_compress YES have_crypt YES have_...
  • i486DX2 processors with internal frequency doubling. The i486DX4. Other 486 CPUs. 10. The Pentium. Pins and signals. The internal structure of the Pentium. The integer pipelines u and v. ...
  • time stamp counter = true RDMSR and WRMSR support = true physical address extensions = true machine check exception = true CMPXCHG8B inst. = true APIC on chip = true ...
  • civetweb issues

    2021-01-10 05:30:10
    I'm playing with civetweb webserver module on latest zephyr tree and I see some issues here. Stack size: When I try to serve a bit bigger pages that are in provided in ...
  • ft_min_word_len','4' 'ft_query_expansion_limit','20' 'ft_stopword_file','(built-in)' 'general_log','OFF' 'general_log_file','/...
  • python programming

    2009-01-17 08:52:35
    4.4.3 Better Change Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 4.5 File Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ...
  • cc2530_user_guide

    2016-04-21 16:17:14
    2.3.3 Program Status Word............................................................................................. 35 2.3.4 Accumulator................................................................
  • I'...to something higher than 6 and try to flood my board with many requests then some tx buffers are lost and never freed. For network testing I use wrk benchmarking tool with command: ...
