

Chih-Fong Tsai

Department of Information Management, National Central University, Jhongli 32001, Taiwan
Received 26 August 2012; Accepted 19 September 2012
Academic Editors: F. Camastra, J. A. Hernandez, P. Kokol, J. Wang, and S. Zhu

Copyright © 2012 Chih-Fong Tsai. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract
Content-based image retrieval (CBIR) systems require users to query images by their low-level visual content; this not only makes it hard for users to formulate queries, but can also lead to unsatisfactory retrieval results. To address this, image annotation was proposed. The aim of image annotation is to automatically assign keywords to images, so that image retrieval users are able to query images by keywords. Image annotation can be regarded as an image classification problem: images are represented by some low-level features, and supervised learning techniques are used to learn the mapping between low-level features and high-level concepts (i.e., class labels). One of the most widely used feature representation methods is bag-of-words (BoW). This paper reviews related work on improving and/or applying BoW for image annotation. Moreover, many recent works (from 2006 to 2012) are compared in terms of the methodology of BoW feature generation and experimental design. In addition, several different issues in using BoW are discussed, and some important issues for future research are identified.

1. Introduction
Advances in computer and multimedia technologies allow for the production of digital images and large, low-cost repositories for image storage. This has led to the rapid increase in the size of image collections, including digital libraries, medical imaging, art and museum collections, journalism, advertising, home photo archives, and so forth. As a result, it is necessary to design image retrieval systems which can operate on a large scale. The main goal is to create, manage, and query image databases in an efficient and effective (i.e., accurate) manner.
Content-based image retrieval (CBIR), which was proposed in the early 1990s, is a technique to automatically index images by extracting their (low-level) visual features, such as color, texture, and shape; the retrieval of images is then based solely upon the indexed image features [1–3]. Therefore, it is hypothesized that relevant images can be retrieved by calculating the similarity between the low-level image contents through browsing, navigation, query-by-example, and so forth. Typically, images are represented as points in a high-dimensional feature space, and a metric is used to measure the similarity or dissimilarity between images in this space. Images close to the query are thus considered similar to it and are retrieved. Although CBIR introduced automated image feature extraction and indexing, it does not overcome the so-called semantic gap described below.
The semantic gap is the gap between the low-level features extracted and indexed by computers and the high-level concepts (or semantics) of users' queries. That is, automated CBIR systems cannot readily match users' requests. The notion of similarity in the user's mind is typically based on high-level abstractions, such as activities, entities/objects, events, or some evoked emotions, among others. Therefore, retrieval by similarity using low-level features like color or shape will not be very effective. In other words, human similarity judgments do not obey the requirements of the similarity metric used in CBIR systems. In addition, general users usually find it difficult to search or query images by using color, texture, and/or shape features directly. They usually prefer textual or keyword-based queries, since these are easier and more intuitive for representing their information needs [4–6]. However, it is very challenging to make computers capable of understanding or extracting high-level concepts from images as humans do.
Consequently, the semantic gap problem has been approached by automatic image annotation. In automatic image annotation, computers are able to learn which low-level features correspond to which high-level concepts. Specifically, the aim of image annotation is to make the computers extract meanings from the low-level features by a learning process based on a given set of training data which includes pairs of low-level features and their corresponding concepts. Then, the computers can assign the learned keywords to images automatically. For the review of image annotation, please refer to Tsai and Hung [7], Hanbury [8], and Zhang et al. [9].
Image annotation can be defined as the process of automatically assigning keywords to images. It can be regarded as an automatic classification of images by labeling images into one of a number of predefined classes or categories, where classes have assigned keywords or labels which can describe the conceptual content of images in that class. Therefore, the image annotation problem can be thought of as image classification or categorization.
More specifically, image classification can be divided into object categorization [10] and scene classification. Object categorization focuses on classifying images into "concrete" categories, such as "agate", "car", and "dog". Scene classification, on the other hand, can be regarded as abstract-keyword-based image annotation [11, 12], where scene categories such as "harbor", "building", and "sunset" can be regarded as assemblages of multiple physical or entity objects treated as a single entity. The difference between object recognition/categorization and scene classification was defined by Quelhas et al. [13].
However, image annotation performance is heavily dependent on image feature representation. Recently, the bag-of-words (BoW) or bag-of-visual-words model, a well-known and popular feature representation method for document representation in information retrieval, was first applied to the field of image and video retrieval by Sivic and Zisserman [14]. Moreover, BoW has generally shown promising performance for image annotation and retrieval tasks [15–22].
The BoW feature is usually based on tokenizing keypoint-based features, for example, scale-invariant feature transform (SIFT) [23], to generate a visual-word vocabulary (or codebook). Then, the visual-word vector of an image contains the presence or absence information of each visual word in the image, for example, the number of keypoints in the corresponding cluster, that is, visual word.
Since 2003, BoW has been used extensively in image annotation, but there has not as yet been any comprehensive review of this topic. Therefore, the aim of this paper is to review the work of using BoW for image annotation from 2006 to 2012.
The rest of this paper is organized as follows. Section 2 describes the process of extracting the BoW feature for image representation and annotation. Section 3 discusses some important extension studies of BoW, including the improvement of BoW per se and its application to other related research problems. Section 4 provides some comparisons of related work in terms of the methodology of constructing the BoW feature (including the detection method, the clustering algorithm, the number of visual words, and so forth) and the experimental setup (including the datasets used, the number of object or scene categories, and so forth). Finally, Section 5 concludes the paper.
2. Bag-of-Words Representation
The bag-of-words (BoW) methodology was first proposed in the text retrieval domain problem for text document analysis, and it was further adapted for computer vision applications [24]. For image analysis, a visual analogue of a word is used in the BoW model, which is based on the vector quantization process by clustering low-level visual features of local regions or points, such as color, texture, and so forth.
To extract the BoW feature from images involves the following steps: (i) automatically detect regions/points of interest, (ii) compute local descriptors over those regions/points, (iii) quantize the descriptors into words to form the visual vocabulary, and (iv) find the occurrences in the image of each specific word in the vocabulary for constructing the BoW feature (or a histogram of word frequencies) [24]. Figure 1 describes these four steps to extract the BoW feature from images.

Figure 1: Four steps for constructing the bag-of-words for image representation.

The BoW model can be defined as follows. Given a training dataset D containing N images, represented by D = {d_1, d_2, ..., d_N}, where d_i is the set of visual features extracted from image i, an unsupervised learning algorithm, such as k-means, is used to group the features into a fixed number of visual words (or clusters) W = {w_1, w_2, ..., w_V}, where V is the vocabulary (cluster) size. The data can then be summarized in a co-occurrence table of counts N(w_i, d_j), where N(w_i, d_j) denotes how often the word w_i occurs in image d_j.
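The quantization and counting just described can be sketched in a few lines of Python. This is a toy illustration only: made-up 8-dimensional "descriptors" stand in for real local features, and a plain Lloyd's k-means replaces a production clustering implementation.

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Group local descriptors into k visual words with plain Lloyd's k-means."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), size=k, replace=False)]
    for _ in range(iters):
        # assign every descriptor to its nearest center
        d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = descriptors[labels == j].mean(0)
    return centers

def bow_histogram(descriptors, centers):
    """Count N(w, d): how often each visual word occurs in one image."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.bincount(d2.argmin(1), minlength=len(centers))

# toy descriptors for two images, drawn around two different prototypes
rng = np.random.default_rng(1)
img1 = rng.normal(0.0, 0.1, (50, 8))
img2 = rng.normal(1.0, 0.1, (60, 8))
codebook = build_codebook(np.vstack([img1, img2]), k=2)
h1, h2 = bow_histogram(img1, codebook), bow_histogram(img2, codebook)
```

Each image's histogram sums to its number of keypoints, which is exactly the co-occurrence table N(w, d) described above, one row per image.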
2.1. Interest Point Detection
The first step of the BoW methodology is to detect local interest regions or points. The features of these interest points (or keypoints) are computed at predefined locations and scales. Several well-known region detectors described in the literature are discussed below [25, 26].
(i) Harris-Laplace regions are detected by the scale-adapted Harris function and selected in scale-space by the Laplacian-of-Gaussian operator. Harris-Laplace detects corner-like structures.
(ii) DoG regions are localized at local scale-space maxima of the difference-of-Gaussian. This detector is suitable for finding blob-like structures. In addition, the DoG point detector has previously been shown to perform well, and it is also faster and more compact (fewer feature points per image) than other detectors.
(iii) Hessian-Laplace regions are localized in space at the local maxima of the Hessian determinant and in scale at the local maxima of the Laplacian-of-Gaussian.
(iv) Salient regions are detected in scale-space at local maxima of the entropy, where the entropy of pixel intensity histograms is measured for circular regions of various sizes at each image position.
(v) Maximally stable extremal regions (MSERs) are components of connected pixels in a thresholded image. A watershed-like segmentation algorithm is applied to image intensities, and segment boundaries that are stable over a wide range of thresholds define the regions.
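As a rough illustration of the DoG idea in (ii), the following sketch finds local maxima of a single-scale difference-of-Gaussian response. It is a toy version, not a full scale-space implementation: the sigma ratio, threshold, and neighborhood size are arbitrary choices for the example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_keypoints(image, sigma=1.0, k=1.6, thresh=0.02):
    """Difference-of-Gaussian response and its local maxima (blob-like points)."""
    dog = gaussian_filter(image, sigma * k) - gaussian_filter(image, sigma)
    # keep a pixel if it is the maximum of its 3x3 neighbourhood and strong enough
    local_max = (dog == maximum_filter(dog, size=3)) & (np.abs(dog) > thresh)
    ys, xs = np.nonzero(local_max)
    return [(int(y), int(x)) for y, x in zip(ys, xs)]

# a dark blob on a bright background should fire the detector near its centre
img = np.ones((32, 32))
img[14:18, 14:18] = 0.0
pts = dog_keypoints(img)
```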
Mikolajczyk et al. [27] compare six types of well-known detectors: detectors based on affine normalization around Harris and Hessian points, MSER, an edge-based region detector, a detector based on intensity extrema, and a detector of salient regions. They conclude that the Hessian-Affine detector performs best.
On the other hand, according to Hörster and Lienhart [21], interest points can be detected by a sparse or a dense approach. For sparse features, interest points are detected at local extrema in the difference-of-Gaussian pyramid [23]. A position and scale are automatically assigned to each point, and thus the extracted regions are invariant to these properties. For dense features, on the other hand, interest points are defined at evenly sampled grid points, and feature vectors are then computed based on three different neighborhood sizes (i.e., at different scales) around each interest point.
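The dense sampling strategy is simple to sketch. The following toy helper enumerates evenly spaced grid keypoints; the step and margin values are arbitrary example choices, not parameters from the cited work.

```python
import numpy as np

def dense_grid_points(height, width, step=8, margin=8):
    """Keypoints on an evenly sampled grid (the 'dense' sampling strategy).
    Descriptors would then be computed around each point at several scales."""
    ys = np.arange(margin, height - margin + 1, step)
    xs = np.arange(margin, width - margin + 1, step)
    return [(int(y), int(x)) for y in ys for x in xs]

# a 64x64 image sampled every 8 pixels with an 8-pixel border
pts = dense_grid_points(64, 64, step=8, margin=8)
```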
Some authors believe that a very precise segmentation of an image is not required for the scene classification problem [28], and some studies have shown that coarse segmentation is very suitable for scene recognition. In particular, Bosch et al. [29] compare four dense descriptors with the widely used sparse descriptor (i.e., the Harris detector) [14, 15] and show that the best results are obtained with the dense descriptors. This is because there is more information in scene images, and intuitively a dense image description is necessary to capture uniform regions such as sky, calm water, or road surface in many natural scenes. Similarly, Jurie and Triggs [30] show that sampling many patches on a regular dense grid (or a fixed number of patches) outperforms the use of interest points. In addition, Fei-Fei and Perona [31] and Bosch et al. [29] show that dense descriptors outperform sparse ones.
2.2. Local Descriptors
In most studies, some single local descriptors are extracted, in which the Scale Invariant Feature Transform (SIFT) descriptor is the most widely extracted [23]. It combines a scale invariant region detector and a descriptor based on the gradient distribution in the detected regions. The descriptor is represented by a 3D histogram of gradient locations and orientations. The dimensionality of the SIFT descriptor is 128.
In order to reduce the dimensionality of the SIFT descriptor, which is usually 128 dimensions per keypoint, principal component analysis (PCA) can be used for increasing image retrieval accuracy and faster matching [32]. Specifically, Uijlings et al. [33] show that retrieval performance can be increased by using PCA for the removal of redundancy in the dimensions.
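A minimal PCA reduction of a descriptor matrix via SVD might look as follows. Random data stands in for real SIFT descriptors, and the target dimensionality of 36 is an arbitrary example, not a recommendation from [32, 33].

```python
import numpy as np

def pca_reduce(descriptors, n_components):
    """Project descriptors onto their top principal components via SVD."""
    mean = descriptors.mean(0)
    centered = descriptors - mean
    # rows of vt are the principal directions, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

rng = np.random.default_rng(0)
sift_like = rng.normal(size=(200, 128))   # stand-in for 200 SIFT descriptors
reduced = pca_reduce(sift_like, 36)
```

Matching in the reduced 36-dimensional space is cheaper than in the original 128 dimensions, which is the motivation given above.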
SIFT was found to work best [13, 25, 34, 35]. Specifically, Mikolajczyk and Schmid [34] compared 10 different descriptors extracted by the Harris-Affine detector, namely, SIFT, gradient location and orientation histograms (GLOH) (an extension of SIFT), shape context, PCA-SIFT, spin images, steerable filters, differential invariants, complex filters, moment invariants, and cross-correlation of sampled pixel values. They show that the SIFT-based descriptors perform best.
In addition, Quelhas et al. [13] confirm in practice that DoG + SIFT constitutes a reasonable choice. Very few studies consider the extraction of multiple different descriptors. For example, Li et al. [36] combine (or fuse) the SIFT descriptor with the concatenation of a block- and blob-based HSV histogram and local binary patterns to generate the BoW.
2.3. Visual Word Generation/Vector Quantization
When the keypoints are detected and their features are extracted, such as with the SIFT descriptor, the final step of extracting the BoW feature from images is based on vector quantization. In general, the k-means clustering algorithm is used for this task, and the number of visual words generated is based on the number of clusters (i.e., k). Jiang et al. [17] conducted a comprehensive study on the representation choices of BoW, including vocabulary size, weighting scheme (such as binary, term frequency (TF), and term frequency-inverse document frequency (TF-IDF)), stop word removal, feature selection, and so forth, for video and image annotation.
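The TF-IDF weighting scheme mentioned above can be sketched as follows on a toy 3-image, 3-word count matrix. Note that a word occurring in every image receives zero weight, which is exactly the stop-word-like behavior the weighting is meant to produce.

```python
import numpy as np

def tfidf(counts):
    """Weight an (n_images x n_words) visual-word count matrix with TF-IDF."""
    tf = counts / np.maximum(counts.sum(1, keepdims=True), 1)
    df = (counts > 0).sum(0)                       # document frequency per word
    idf = np.log(len(counts) / np.maximum(df, 1))  # rarer words weigh more
    return tf * idf

counts = np.array([[5, 0, 1],
                   [4, 1, 0],
                   [3, 2, 0]], dtype=float)
weighted = tfidf(counts)
```

Word 0 occurs in all three images, so its IDF is log(1) = 0 and its column vanishes.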
To generate visual words, many studies focus on capturing spatial information in order to improve the limitations of the conventional BoW model, such as Yang et al. [37], Zhang et al. [38], Chen et al. [39], S. Kim and D. Kim [40], Lu and Ip [41], Lu and Ip [42], Uijlings et al. [43], Cao and Fei-Fei [44], Philbin et al. [45], Wu et al. [46], Agarwal and Triggs [47], Lazebnik et al. [48], Marszałek and Schmid [49], and Monay et al. [50], in which spatial pyramid matching introduced by Lazebnik et al. [48] has been widely compared as one of the baselines.
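To illustrate the spatial idea, a minimal spatial pyramid histogram could be sketched as below. This toy version omits the per-level weighting used by Lazebnik et al. [48] and simply concatenates per-cell BoW histograms over three pyramid levels.

```python
import numpy as np

def spatial_pyramid(points, words, k, img_h, img_w, levels=2):
    """Concatenate per-cell BoW histograms over a spatial pyramid (unweighted).
    points: (n, 2) array of (y, x) keypoint locations; words: their word ids."""
    hists = []
    for level in range(levels + 1):
        cells = 2 ** level                      # 1x1, 2x2, 4x4, ... grids
        cell_h, cell_w = img_h / cells, img_w / cells
        for cy in range(cells):
            for cx in range(cells):
                in_cell = ((points[:, 0] // cell_h).astype(int) == cy) & \
                          ((points[:, 1] // cell_w).astype(int) == cx)
                hists.append(np.bincount(words[in_cell], minlength=k))
    return np.concatenate(hists)

# two keypoints in opposite corners of a 64x64 image, vocabulary of k=2 words
pts_arr = np.array([[1, 1], [60, 60]])
word_ids = np.array([0, 1])
hist = spatial_pyramid(pts_arr, word_ids, k=2, img_h=64, img_w=64, levels=2)
```

Each point is counted once per level, so the concatenated histogram encodes where in the image each word occurs, not just how often.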
However, Van de Sande et al. [51] have shown that a severe drawback of the bag-of-words model is its high computational cost in the quantization step. In other words, the most expensive part of a state-of-the-art bag-of-words setup is the vector quantization step, that is, finding the closest cluster for each data point in the k-means algorithm.
Uijlings et al. [33] compare k-means and random forests for the word assignment task in terms of computational efficiency. Using different descriptors with different grid sizes, random forests are significantly faster than k-means. In addition, using random forests to generate BoW can provide a slightly better mean average precision (MAP) than k-means does. They also recommend two BoW pipelines for when the focus is on accuracy and on speed, respectively.
In their seminal work, Philbin et al. [45] compare approximate k-means, hierarchical k-means, and (exact) k-means in terms of precision performance and computational cost, finding that approximate k-means works best (see Section 4.3 for further discussion).
Chum et al. [52] observe that feature detection and quantization are noisy processes and this can result in variation in the particular visual words that appear in different images of the same object, leading to missed results.
2.4. Learning Models
After the BoW feature is extracted from images, it is entered into a classifier for training or testing. Besides constructing discriminative models as classifiers for image annotation, some Bayesian text models based on Latent Semantic Analysis [53], such as probabilistic Latent Semantic Analysis (pLSA) [54] and Latent Dirichlet Allocation (LDA) [55], can be adapted to model object and scene categories.
2.4.1. Discriminative Models
The construction of discriminative models for image annotation is based on the supervised machine learning principle for pattern recognition. Supervised learning can be thought of as learning by examples or learning with a teacher [56]. The teacher has knowledge of the environment, which is represented by a set of input-output examples. In order to classify unknown patterns, a certain number of training samples are available for each class, and they are used to train the classifier [57].
The learning task is to compute a classifier or model f that approximates the mapping between the input-output examples and correctly labels the training set with some level of accuracy. This can be called the training or model generation stage. After the model f is generated or trained, it is able to classify an unknown instance into one of the class labels learned from the training set. More specifically, the classifier calculates the similarity of the instance to all trained classes and assigns the unlabeled instance to the class with the highest similarity measure. The most widely used classifier for this purpose is based on support vector machines (SVM) [58].
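A hedged sketch of SVM-based annotation on BoW vectors follows. Toy Poisson-distributed counts stand in for real histograms, and scikit-learn's LinearSVC is one possible implementation choice, not the specific classifier of any cited work.

```python
import numpy as np
from sklearn.svm import LinearSVC

# toy BoW vectors for two classes: class 0 dominated by word 0, class 1 by word 1
rng = np.random.default_rng(0)
X0 = rng.poisson([8, 1, 1], size=(30, 3))
X1 = rng.poisson([1, 8, 1], size=(30, 3))
X = np.vstack([X0, X1]).astype(float)
y = np.array([0] * 30 + [1] * 30)

# train the linear SVM, then label two unseen histograms
clf = LinearSVC(C=1.0).fit(X, y)
pred = clf.predict([[9.0, 0.0, 1.0], [0.0, 9.0, 1.0]])
```

In practice the BoW vectors would be TF-IDF weighted or normalized first, and a kernel SVM (e.g., with a histogram intersection or chi-square kernel) is a common alternative.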
2.4.2. Generative Models
In text analysis, pLSA and LDA are used to discover topics in a document using the BoW document representation. For image annotation, documents and discovered topics are thought of as images and object categories, respectively. Therefore, an image containing instances of several objects is modeled as a mixture of topics. This topic distribution over the images is used to classify an image as belonging to a certain scene. For example, if an image contains “water with waves”, “sky with clouds”, and “sand”, it will be classified into the “coast” scene class [24].
Following the previous definition of BoW, pLSA is a latent variable model for co-occurrence data which associates an unobserved class (topic) variable z_k with each observation. A joint probability model P(w, d) over visual words and images is defined by the mixture

P(w, d) = P(d) Σ_k P(w | z_k) P(z_k | d),

where the P(w | z_k) are topic-specific word distributions and each image is modeled as a mixture of topics through P(z_k | d).
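A generic EM procedure for fitting such a pLSA mixture can be sketched as follows. This is a textbook EM on a toy count matrix, not the exact procedure of any cited work; the dense (d, z, w) arrays are fine for illustration but would not scale to real vocabularies.

```python
import numpy as np

def plsa(counts, n_topics, iters=50, seed=0):
    """EM for pLSA on an (images x words) count matrix N(d, w).
    Returns P(w|z) and P(z|d)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_w_z = rng.random((n_topics, n_words)); p_w_z /= p_w_z.sum(1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics)); p_z_d /= p_z_d.sum(1, keepdims=True)
    for _ in range(iters):
        # E-step: posterior P(z | d, w) for every (image, word) cell
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]          # (d, z, w)
        p_z_dw = joint / np.maximum(joint.sum(1, keepdims=True), 1e-12)
        # M-step: re-estimate the multinomials from expected counts
        expected = counts[:, None, :] * p_z_dw                 # (d, z, w)
        p_w_z = expected.sum(0)
        p_w_z /= np.maximum(p_w_z.sum(1, keepdims=True), 1e-12)
        p_z_d = expected.sum(2)
        p_z_d /= np.maximum(p_z_d.sum(1, keepdims=True), 1e-12)
    return p_w_z, p_z_d

# two images about words {0,1}, two about words {2,3}: two clear "topics"
counts = np.array([[5, 5, 0, 0],
                   [4, 6, 0, 0],
                   [0, 0, 5, 5],
                   [0, 0, 6, 4]], dtype=float)
p_w_z, p_z_d = plsa(counts, n_topics=2)
```

The per-image topic mixture P(z|d) is what would then be fed to a scene classifier.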
On the other hand, LDA treats the multinomial weights θ over topics as latent random variables. In particular, the pLSA model is extended by sampling those weights from a Dirichlet distribution. This extension allows the model to assign probabilities to data outside the training corpus and uses fewer parameters, which can reduce the overfitting problem.
The goal of LDA is to maximize the following likelihood:

p(w | α, β) = ∫∫ p(θ | α) p(φ | β) Π_n Σ_{z_n} p(z_n | θ) p(w_n | z_n, φ) dθ dφ,

where θ and φ are multinomial parameters over the topics and words, respectively, and p(θ | α) and p(φ | β) are Dirichlet distributions parameterized by the hyperparameters α and β.
Bosch et al. [24] compare BoW + pLSA with different semantic modeling approaches, such as the traditional global feature representation and a block-based feature representation [59], using the k-nearest neighbor classifier. They show that BoW + pLSA performs best. Specifically, the HSI histogram + co-occurrence matrices + edge direction histogram are used as the image descriptors.
However, it is interesting that Lu and Ip [41] and Quelhas et al. [60] show that pLSA does not perform better than BoW + SVM on the Corel dataset, where the former uses block-based HSV and Gabor texture features and the latter uses keypoint-based SIFT features.
3. Extensions of BoW
This section reviews the literature regarding using BoW for some related problems. They are divided into five categories, namely, feature representation, vector quantization, visual vocabulary construction, image segmentation, and others.
3.1. Feature Representation
Since the annotation accuracy is heavily dependent on feature representation, using different region/point descriptors and/or the BoW feature representation will provide different levels of discriminative power for annotation. For example, Mikolajczyk and Schmid [34] compare 10 different local descriptors for object recognition. Jiang et al. [17] examine the classification accuracy of the BoW features using different numbers of visual words and different weighting schemes.
Due to the drawbacks that vector quantization may reduce the discriminative power of images and the BoW methodology ignores geometric relationships among visual words, Zhong et al. [61] present a novel scheme where SIFT features are bundled into local groups. These bundled features are repeatable and are much more discriminative than an individual SIFT feature. In other words, a bundled feature provides a flexible representation that allows us to partially match two groups of SIFT features.
On the other hand, since the image feature generally carries mixed information of the entire image, which may contain multiple objects and background, annotation accuracy can be degraded by such noisy (or diluted) feature representations. Chen et al. [62] propose a novel feature representation, pseudo-objects. Each pseudo-object is a subset of proximate feature points with its own feature vector, representing a local area that approximates a candidate object in the image.
Gehler and Nowozin [63] focus on feature combination, which is to combine multiple complementary features based on different aspects such as shape, color, or texture. They study several models that aim at learning the correct weighting of different features from training data. They provide insight into when combination methods can be expected to work and how the benefit of complementary features can be exploited most efficiently.
Qin and Yung [64] use localized maximum-margin learning to fuse different types of features during the BoW modeling. Particularly, the region of interest is described by a linear combination of the dominant feature and other features extracted from each patch at different scales, respectively. Then, dominant feature clustering is performed to create contextual visual words, and each image in the training set is evaluated against the codebook using the localized maximum-margin learning method to fuse other features, in order to select a list of contextual visual words that best represents the patches of the image.
As there is a relation between the composition of a photograph and its subject, similar subjects are typically photographed in a similar style. Van Gemert [65] exploits the assumption that images within a category share a similar style, such as colorfulness, lighting, depth of field, viewpoint, and saliency, and uses photographic style for category-level image classification. In particular, where the spatial pyramid groups features spatially [48], this work focuses on more general feature grouping, including these photographic style attributes.
Rasiwasia and Vasconcelos [66] introduce an intermediate space, based on a low-dimensional semantic "theme" image representation, which is learned with weak supervision from casual image annotations. Each theme induces a probability density on the space of low-level features, and images are represented as vectors of posterior theme probabilities.
3.2. Vector Quantization
In order to reduce the quantization noise, Jégou et al. [67] construct short codes using quantization. The goal is to estimate distances using vector-to-centroid distances; that is, the query vector is not quantized, and codes are assigned only to the database vectors. The feature space is decomposed into a Cartesian product of low-dimensional subspaces, and each subspace is quantized separately. In particular, a vector is represented by a short code composed of its subspace quantization indices.
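The decomposition described above can be sketched roughly as follows. This toy product quantizer uses arbitrary subspace and codebook sizes; the real method also builds lookup tables for fast asymmetric distance computation, which is omitted here.

```python
import numpy as np

def pq_train(data, n_sub, k, iters=15, seed=0):
    """One small k-means codebook per subspace of the split feature vector."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1] // n_sub
    codebooks = []
    for s in range(n_sub):
        sub = data[:, s * dim:(s + 1) * dim]
        centers = sub[rng.choice(len(sub), k, replace=False)]
        for _ in range(iters):
            labels = ((sub[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
            for j in range(k):
                if (labels == j).any():
                    centers[j] = sub[labels == j].mean(0)
        codebooks.append(centers)
    return codebooks

def pq_encode(vec, codebooks):
    """Represent a vector by the nearest-centroid index in each subspace."""
    dim = len(vec) // len(codebooks)
    return [int(((vec[s * dim:(s + 1) * dim] - cb) ** 2).sum(1).argmin())
            for s, cb in enumerate(codebooks)]

rng = np.random.default_rng(2)
data = rng.normal(size=(100, 8))
cbs = pq_train(data, n_sub=2, k=4)        # 8 dims -> two 4-dim subspaces
code = pq_encode(data[0], cbs)            # short code: one index per subspace
```

With n_sub subspaces of k centroids each, the code addresses k^n_sub effective cells while storing only n_sub small indices per vector.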
As abrupt quantization into discrete bins does cause some aliasing, Agarwal and Triggs [47] focus on soft vector quantization, that is, softly voting into the cluster centers that lie close to the patch, for example, with Gaussian weights. They show that diagonal-covariance Gaussian mixtures fitted using expectation-maximization performs better than hard vector quantization.
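Soft assignment with Gaussian weights can be sketched as below; the 2-D centers are toy values and σ is a free smoothing parameter of the illustration.

```python
import numpy as np

def soft_assign(descriptor, centers, sigma=1.0):
    """Softly vote into all visual words with Gaussian weights,
    instead of a single hard vote for the nearest center."""
    d2 = ((centers - descriptor) ** 2).sum(1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    return w / w.sum()

centers = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
weights = soft_assign(np.array([0.5, 0.5]), centers, sigma=2.0)
```

A patch near a cluster boundary thus contributes to several histogram bins, which reduces the aliasing caused by abrupt quantization.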
Similarly, Fernando et al. [68] propose a supervised learning algorithm based on a Gaussian mixture model, which not only generalizes the -means by allowing “soft assignments”, but also exploits supervised information to improve the discriminative power of the clusters. In their approach, an EM-based approach is used to optimize a convex combination of two criteria, in which the first one is unsupervised and based on the likelihood of the training data, and the second is supervised and takes into account the purity of the clusters.
On the other hand, Wu et al. [69] propose a Semantics-Preserving Bag-of-Words (SPBoW) model, which considers the distance between the semantically identical features as a measurement of the semantic gap and tries to learn a codebook by minimizing this semantic gap. That is, the codebook generation task is formulated as a distance metric learning problem. In addition, one visual feature can be assigned to multiple visual words in different object categories.
In de Campos et al. [70], images are modeled as order-less sets of weighted visual features, where each visual feature is associated with a weight factor that reflects its relevance. In this approach, visual saliency maps are used to determine the relevance weight of a feature.
Zheng et al. [71] argue that for the BoW model used in information retrieval and document categorization, the textual word possesses semantics itself and the documents are well-structured data regulated by grammar, linguistic, and lexicon rules. However, there appear to be no well-defined rules in the visual word composition of images. For instance, objects of the same class might have arbitrarily different shapes and visual appearances, while objects of different classes might share similar local appearances. To this end, a higher-level visual representation, the visual synset, is presented for object recognition. First, an intermediate visual descriptor, the delta visual phrase, is constructed from a frequently co-occurring visual word-set with similar spatial context. Second, the delta visual phrases are clustered into a visual synset based on their probabilistic "semantics", that is, class probability distribution.
Besides reducing the vector quantization noise, another severe drawback of the BoW model is its high computational cost. To address this problem, Moosmann et al. [72] introduce extremely randomized clustering forests based on ensembles of randomly created clustering trees and show that more accurate results can be obtained as well as much faster training and testing.
Recently, Van de Sande et al. [51] proposed two algorithms to combine GPU hardware and a parallel programming model to accelerate the quantization and classification components of the visual categorization architecture.
On the other hand, Hare et al. [73] show that the intensity inversion characteristics of the SIFT descriptor and local interest region detectors can be exploited to decrease the time it takes to create vocabularies of visual terms. In particular, they show that clustering inverted and noninverted (or minimum and maximum) features separately results in the same retrieval performance as clustering all the features as a single set (with the same overall vocabulary size).
3.3. Visual Vocabulary Construction
Since related studies, such as Jegou et al. [74], Marszałek and Schmid [49], Sivic and Zisserman [14], and Winn et al. [75], have shown that the commonly generated visual words are still not as expressive as text words, in Zhang et al. [76], images are represented as visual documents composed of repeatable and distinctive visual elements, which are comparable to text words. They propose descriptive visual words (DVWs) and descriptive visual phrases (DVPs) as the visual correspondences to text words and phrases, where visual phrases refer to the frequently co-occurring visual word pairs.
Gavves et al. [77] focus on identifying pairs of independent, distant words—the visual synonyms—that are likely to host image patches of similar visual reality. Specifically, landmark images are considered, where the image geometry guides the detection of synonym pairs. Image geometry is used to find those image features that lie in a nearly identical physical location, yet are assigned to different words of the visual vocabulary.
On the other hand, López-Sastre et al. [78] present a novel method for constructing a visual vocabulary that takes into account the class labels of images. It consists of two stages: Cluster Precision Maximisation (CPM) and Adaptive Refinement. In the first stage, a Reciprocal Nearest Neighbours (RNN) clustering algorithm is guided towards class representative visual words by maximizing a new cluster precision criterion. Next, an adaptive threshold refinement scheme is proposed with the aim of increasing vocabulary compactness, while at the same time improving the recognition rate and further increasing the representativeness of the visual words for category-level object recognition. In other words, this is a correlation clustering based approach, which works as a kind of metaclustering and optimizes the cut-off threshold for each cluster separately.
Constructing visual codebook ensembles is another approach to improving image annotation accuracy. In Luo et al. [18], three methods for constructing visual codebook ensembles are presented. The first is based on diverse individual visual codebooks built by randomly choosing interest points. The second uses a random sub-training image dataset with random interest points. The third directly utilizes different patch information to construct an ensemble with high diversity. Consequently, different types of image representations are obtained. Then, a classification ensemble is learned from the different representation datasets derived from the same training set.
Bae and Juang [79] apply the idea of linguistic parsing to generate the BoW feature for image annotation. That is, images are represented by a number of variable-size patches by a multidimensional incremental parsing algorithm. Then, the occurrence pattern of these parsed visual patches is fed into the LSA framework.
Since one major challenge in object categorization is to find class models that are “invariant” enough to incorporate naturally-occurring intraclass variations and yet “discriminative” enough to distinguish between different classes, Winn et al. [75] proposed a supervised learning algorithm, which automatically finds such models. In particular, it classifies a region according to the proportions of different visual words. The specific visual words and the typical proportions in each object are learned from a segmented training set.
Kesorn and Poslad [80] propose a framework to enhance the visual word quality. First of all, visual words from representative keypoints are constructed by reducing similar keypoints. Second, domain specific noninformative visual words are detected, which are useless for representing the content of visual data but which can degrade the categorization capability. A noninformative visual word is defined as having a high document frequency and a small statistical association with all the concepts in the image collection. Third, the vector space model of visual words is restructured with respect to a structural ontology model in order to solve visual synonym and polysemy problems.
Tirilly et al. [81] present a new image representation called visual sentences, which allows visual words to be “read” in a certain order, as in the case of text. In particular, simple spatial relations between visual words are considered. In addition, pLSA is used to eliminate the noisiest visual words.
3.4. Image Segmentation
Effective image segmentation can be an important factor affecting BoW feature generation. Uijlings et al. [43] study the role of context in the BoW approach. They observe that precise localization of object patches based on image segmentation is likely to yield better performance than the dense sampling strategy, which samples 8 × 8 pixel patches at every 4th pixel.
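The dense sampling strategy mentioned above (fixed-size patches on a regular grid) is straightforward to sketch; the function name and defaults below are illustrative:

```python
import numpy as np

def dense_patches(image, patch=8, stride=4):
    """Densely sample patch x patch pixel windows at every stride-th pixel,
    as in the dense sampling strategy described above."""
    image = np.asarray(image)
    h, w = image.shape[:2]
    patches = [
        image[y:y + patch, x:x + patch]
        for y in range(0, h - patch + 1, stride)
        for x in range(0, w - patch + 1, stride)
    ]
    return np.stack(patches)
```

Each sampled patch would then be described (e.g., by SIFT or a texture histogram) and quantized against the codebook.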
Besides point detection, an image can be segmented into several (or a fixed number of) regions or blocks. However, very few studies have compared the effect of image segmentation on generating the BoW feature. In Cheng and Wang [82], 20–50 regions per image are segmented, and each region is represented by an HSV histogram and cooccurrence texture features. By using contextual Bayesian networks to model the spatial relationships between local regions and integrating multiple attributes to infer the high-level semantics of an image, this approach performs comparably to or better than a number of works using SIFT descriptors and pLSA for image annotation.
Similarly, Wu et al. [46] extract a texture histogram from the 8 × 8 blocks/patches of each image in their proposed visual language modeling method, which utilizes the spatial correlation of visual words. This representation is compared with the BoW model, including pLSA and LDA using the SIFT descriptor. Since neither image segmentation nor interest point detection is required, the visual language modeling method is not only very efficient but also very effective on the Caltech 7 dataset.
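The spatial correlation of visual words that visual language modeling exploits can be illustrated, in toy form, by counting co-occurring neighbor pairs over a grid of quantized patches. This is only a sketch in the spirit of an n-gram model, not the actual model of [46]:

```python
import numpy as np
from collections import Counter

def visual_bigrams(word_grid):
    """Count horizontally adjacent visual-word pairs over a grid of
    quantized patches (a toy analogue of a text bigram model)."""
    grid = np.asarray(word_grid)
    pairs = zip(grid[:, :-1].ravel(), grid[:, 1:].ravel())
    return Counter(pairs)
```

A real visual language model would estimate conditional probabilities from such counts and score an image by the likelihood of its word grid under each category's model.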
In addition to using the BoW feature for image annotation, Larlus et al. [83] combine BoW with random fields and generative models, such as Dirichlet processes, for more effective object segmentation.
3.5. Others
3.5.1. BoW Applications
Although the BoW model has been extensively studied for general object and scene categorization, it has also been applied in some domain-specific applications, such as human action recognition [84], facial expression recognition [85], medical image analysis [86], robotics, sport image analysis [80], 3D image retrieval and classification [87, 88], image quality assessment [89], and so forth.
3.5.2. Describing Objects/Scenes for Recognition
Farhadi et al. [90] propose shifting the goal of recognition from naming to describing. That is, they focus on describing objects by their attributes, which is not only to name familiar objects, but also to report unusual aspects of a familiar object, such as “spotty dog”, not just “dog”, and to say something about unfamiliar objects, such as “hairy and four-legged”, not just “unknown”.
On the other hand, Sudderth et al. [91] develop hierarchical, probabilistic models for objects, the parts composing them, and the visual scenes surrounding them. These models share information between object categories in three distinct ways. First, parts define distributions over a common low-level feature vocabulary. Second, objects are defined using a common set of parts. Finally, object appearance information is shared between the many scenes in which that object is found.
3.5.3. Query Expansion
Chum et al. [52] adopt the BoW architecture with spatial information for query expansion, which has proven successful in achieving high precision at low recall. On the other hand, Philbin et al. [92] quantize a keypoint to the k-nearest visual words as a form of query expansion.
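Assigning a keypoint to its k nearest visual words, rather than to a single word, can be sketched as below. The functions and the uniform 1/k count-spreading are illustrative assumptions, not the precise scheme of [92]:

```python
import numpy as np

def soft_assign(descriptor, codebook, k=3):
    """Indices of the k nearest visual words for one descriptor."""
    d = ((np.asarray(codebook) - np.asarray(descriptor)) ** 2).sum(axis=1)
    return np.argsort(d)[:k]

def soft_bow(descriptors, codebook, k=3):
    """BoW histogram with each descriptor's count spread uniformly
    over its k nearest words (a simple soft assignment)."""
    hist = np.zeros(len(codebook))
    for desc in descriptors:
        for w in soft_assign(desc, codebook, k):
            hist[w] += 1.0 / k
    return hist
```

Soft assignment reduces the quantization error caused by descriptors that fall near the boundary between two words.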
3.5.4. Similarity Measure
Based on the BoW feature representation, Jegou et al. [74] introduce a contextual dissimilarity measure (CDM), which is iteratively obtained by regularizing the average distance of each point to its neighborhood. In addition, CDM is learned in an unsupervised manner, which does not need to learn the distance measure from a set of training images.
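The regularization idea behind CDM (rescaling distances so that every point's average distance to its neighborhood approaches the global average) can be sketched as follows. This is a simplified illustration, not the exact update rule of Jegou et al. [74]; the neighborhood size, exponent, and iteration count are illustrative:

```python
import numpy as np

def cdm(dist, n_neighbors=3, alpha=0.5, iters=5):
    """Simplified contextual dissimilarity measure: iteratively rescale a
    symmetric distance matrix so each point's mean neighborhood distance
    moves toward the global mean (sketch of the CDM idea)."""
    d = np.array(dist, dtype=float)
    for _ in range(iters):
        # r[i]: mean distance from i to its n nearest neighbors (self excluded)
        order = np.argsort(d, axis=1)[:, 1:n_neighbors + 1]
        r = np.take_along_axis(d, order, axis=1).mean(axis=1)
        # shrink distances around points with large r, expand around small r
        delta = (r.mean() / r) ** alpha
        d = d * np.sqrt(np.outer(delta, delta))
        np.fill_diagonal(d, 0.0)
    return d
```

Note that the update uses only the distance matrix itself, which is why the measure can be learned in an unsupervised manner, as stated above.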
3.5.5. Large Scale Image Databases
Since the aim of image annotation is to support very large scale keyword-based image search, such as web image retrieval, it is very critical to assess existing approaches over some large scale dataset(s). Chum et al. [52], Hörster and Lienhart [21], and Lienhart and Slaney [93] used datasets composed of 100,000 to 250,000 images belonging to 12 categories, which were downloaded from Flickr.
Moreover, Philbin et al. [45] use over 1,000,000 images from Flickr for their experiments, and Zhang et al. [94] use about 370,000 images collected from Google belonging to 1506 object or scene categories.
On the other hand, Torralba and Efros [95] study some bias issues of object recognition datasets. They provide some suggestions for creating a new and high quality dataset to minimize the selection bias, capture bias, and negative set bias. Furthermore, they claim that in the state of today’s datasets there are virtually no studies demonstrating cross-dataset generalization, for example, training on ImageNet, while testing on PASCAL VOC. This could be considered as an additional experimental setup for future works.
3.5.6. Integration of Feature Selection and/or (Spatial) Feature Extraction
Although modeling the spatial relationship between visual words can improve the recognition performance, the spatial features are expensive to compute. Liu et al. [96] propose a method that simultaneously performs feature selection and (spatial) feature extraction based on higher-order spatial features for speed and storage improvements.
For the dimensionality reduction purpose, Elfiky et al. [97] present a novel framework for obtaining a compact pyramid representation. In particular, the divisive information theoretic feature clustering (DITC) algorithm is used to create a compact pyramid representation.
Bosch et al. [98] investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In their approach, latent “topics” are first discovered using pLSA, and a generative model is then applied to the BoW representation of each image.
In contrast to reducing the dimensionality of the feature representation, selecting more discriminative features (e.g., SIFT descriptors) from a given set of training images has also been considered. Shang and Xiao [99] introduce a pairwise image matching scheme to select discriminative features. Specifically, the feature weights are updated using the label information from the training set. As a result, the selected features, which correspond to the foreground content of the images, can highlight the category information of the images.
3.5.7. Integration of Segmentation, Classification, and/or Retrieval
Simultaneously learning object/scene category models and performing segmentation on the detected objects was studied in Cao and Fei-Fei [44]. They propose a spatially coherent latent topic model (Spatial-LTM), which represents an image containing objects in a hierarchical way: by oversegmented image regions of homogeneous appearance and by the salient image patches within those regions. It provides a unified representation for spatially coherent BoW topic models and can simultaneously segment and classify objects.
On the other hand, Tong et al. [100] propose a statistical framework for large-scale near duplicate image retrieval which unifies the step of generating a BoW representation and the step of image retrieval. In this approach, each image is represented by a kernel density function, and the similarity between the query image and a database image is then estimated as the query likelihood.
Shotton et al. [101] utilize semantic texton forests, which are ensembles of decision trees that act directly on image pixels, where the nodes in the trees provide an implicit hierarchical clustering into semantic textons and an explicit local classification estimate. In addition, the bag of semantic textons combines a histogram of semantic textons over an image region with a region prior category distribution, and the bag of semantic textons is computed over the whole image for categorization and over local rectangular regions for segmentation.
3.5.8. Discriminative Learning Models
Romberg et al. [102] extend the standard single-layer pLSA to multiple layers, where the multiple layers handle multiple modalities and a hierarchy of abstractions. In particular, the multilayer multimodal pLSA (mm-pLSA) model is based on two leaf pLSAs and a single top-level pLSA node merging them. In addition, SIFT features and image annotations (tags), as well as the combination of SIFT and HOG features, are considered as two pairs of different modalities.
3.5.9. Novel Category Discovery
In their study, Lee and Grauman [103] discover new categories with the help of known ones. That is, previously learned categories are used to discover familiar content in unsegmented, unlabeled images. Their approach introduces two variants of a novel object-graph descriptor that encode the 2D and 3D spatial layout of object-level cooccurrence patterns relative to an unfamiliar region; these descriptors model the interaction between an image’s known and unknown objects for detecting new visual categories.
3.5.10. Interest Point Detection
Since interest point detection is an important step for extracting the BoW feature, Stottinger et al. [104] propose color interest points for sparse image representation. Particularly, light-invariant interest points are introduced to reduce the sensitivity to varying imaging conditions. Color statistics based on occurrence probability lead to color boosted points, which are obtained through saliency-based feature selection.
4. Comparisons of Related Work
This section compares related work in terms of the ways the BoW feature and experimental setup are structured. These comparisons allow us to figure out the most suitable interest point detector(s), clustering algorithm(s), and so forth used to extract the BoW feature from images. In addition, we are able to realize the most widely used dataset(s) and experimental settings for image annotation by BoW.
4.1. Methodology of BoW Feature Generation
Table 1 compares related work for the methodology of extracting the BoW feature. Note that we leave a blank if the information in our comparisons is not clearly described in these related works.

Table 1: Comparisons of interest point detection, visual words generation, and learning models.

From Table 1 we can observe that the most widely used interest point detector for generating the BoW feature is DoG, and the second and third most popular detectors are Harris-Laplace and Hessian-Laplace, respectively. Besides extracting sparse BoW features, many related studies have focused on dense BoW features.
On the other hand, several studies used some region segmentation algorithms, such as NCuts [116] and Mean-shift [117], to segment an image into several regions to represent keypoints.
For the local feature descriptor used to describe interest points, most studies used the 128-dimensional SIFT feature; some considered using PCA to reduce the dimensionality of SIFT, while others “fuse” a color feature with SIFT, resulting in higher-dimensional features than SIFT alone. Apart from SIFT-related features, some studies used conventional color and texture features to represent local regions or points.
Regarding vector quantization, we can see that k-means is the most widely used clustering algorithm for generating the codebook or visual vocabularies. However, in order to overcome the limitations of k-means, for example, clustering accuracy and computational cost, some studies used hierarchical k-means, approximate k-means, accelerated k-means, and so forth.
For the number of visual words, related works have considered various amounts of clusters during vector quantization. This may be because the datasets used in these works are different. In Jiang et al. [17], different numbers of visual words were studied, and their results show that 1000 is a reasonable choice. Some related studies also used similar numbers of visual words to generate their BoW features.
On the other hand, the most and second most widely used weighting schemes are TF and TF-IDF. This is consistent with Jiang et al. [17], who concluded that these two weighting schemes perform better than the other weighting schemes.
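The TF and TF-IDF weighting schemes mentioned above are standard text-retrieval weights applied to visual word counts; a minimal numpy sketch (function names are illustrative):

```python
import numpy as np

def tf(bow):
    """Term frequency: L1-normalized visual word counts per image."""
    bow = np.asarray(bow, dtype=float)
    return bow / np.maximum(bow.sum(axis=1, keepdims=True), 1e-12)

def tf_idf(bow):
    """TF-IDF: term frequency damped by the inverse document frequency,
    so words appearing in most images receive low weight."""
    bow = np.asarray(bow, dtype=float)
    n_images = bow.shape[0]
    df = np.maximum((bow > 0).sum(axis=0), 1)  # document frequency per word
    idf = np.log(n_images / df)
    return tf(bow) * idf
```

Note that under this formulation a visual word occurring in every image gets an IDF of zero, which is one way such weighting suppresses uninformative words.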
Finally, SVM is no doubt the most popular classification technique as the learning model for image annotation. In particular, one of the most widely used kernel functions for constructing the SVM classifier is the Gaussian radial basis function. However, some other SVM classifiers, such as linear SVM and SVM with a polynomial kernel have also been considered in the literature.
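The Gaussian radial basis function kernel referred to above has the standard form K(a, b) = exp(-γ‖a − b‖²); a small numpy sketch for computing the kernel matrix between two sets of BoW vectors (the γ value is illustrative):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian RBF kernel matrix K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)
```

Such a precomputed kernel matrix can be passed directly to an SVM solver, which is how the nonlinear SVM classifiers in the compared works operate on BoW histograms.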
4.2. Experimental Design
Table 2 compares related work for the experimental design. That is, the chosen dataset(s) and baseline(s) are examined.

Table 2: Comparisons of datasets used and annotation performance.

According to Table 2, most studies considered more than one dataset for their experiments, and many of these datasets contain both object and scene categories. This is very important for image annotation, since the annotated keywords should be broad enough for users to perform keyword-based queries during image retrieval.
Specifically, the PASCAL, Caltech, and Corel datasets are the three most widely used benchmarks for image classification. However, the datasets used in most studies usually contain a small number of categories and images, except in studies focusing on retrieval rather than classification; that is, similarity-based queries are used to retrieve relevant images instead of training a learning model to classify unknown images into one specific category.
For the chosen baselines, most studies compared against BoW and/or spatial pyramid matching based BoW, since their aims were to propose novel approaches to improve these two feature representations. Specifically, the spatial pyramid matching based BoW proposed by Lazebnik et al. [48] is the most popular baseline.
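Spatial pyramid matching extends BoW by concatenating per-cell histograms over increasingly fine grids; a minimal sketch follows. It omits the per-level weights of Lazebnik et al. [48], and the function signature is an illustrative assumption:

```python
import numpy as np

def spatial_pyramid(positions, words, n_words, image_size, levels=2):
    """Concatenate per-cell BoW histograms over a 2^l x 2^l grid at each
    pyramid level (unweighted sketch of spatial pyramid matching).

    positions : (n, 2) keypoint (x, y) coordinates
    words     : (n,) visual word index per keypoint
    """
    positions = np.asarray(positions, dtype=float)
    words = np.asarray(words)
    w, h = image_size
    hists = []
    for level in range(levels):
        g = 2 ** level
        # grid cell of each keypoint at this level
        cx = np.minimum((positions[:, 0] * g // w).astype(int), g - 1)
        cy = np.minimum((positions[:, 1] * g // h).astype(int), g - 1)
        for cell in range(g * g):
            mask = (cy * g + cx) == cell
            hists.append(np.bincount(words[mask], minlength=n_words))
    return np.concatenate(hists)
```

With levels=2 the output is (1 + 4) × n_words long: the plain BoW histogram followed by the four quadrant histograms, which is what lets the representation capture coarse spatial layout.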
Besides improving the feature representation per se, some studies focused on improving the performance of LDA and/or pLSA learning models. Another popular baseline is that of Fei-Fei and Perona [31], who proposed a Bayesian hierarchical model to represent each region as part of a “theme.”
4.3. Discussion
The above comparisons indicate several issues that have not been examined in the literature. Since local features can be represented either by object-based regions obtained from region segmentation [143, 144] or by point-based regions obtained from point detection (cf. Section 2.1), it is unknown which kind of local feature is more appropriate for the tokenization step of large scale image annotation. (For large scale image annotation, the number of annotated keywords is certainly large and their meanings are very broad, covering both object and scene concepts.)
In addition, while the local feature descriptor is a key component of successful image annotation, the number of visual words (i.e., clusters) is another factor affecting annotation performance. Although Jiang et al. [17] conducted a comprehensive study using various numbers of visual words, they only used one dataset, TRECVID, containing 20 concepts. Therefore, one important issue is to provide guidelines for determining the number of visual words over different kinds of image datasets with different image contents.
Learning techniques can be divided into generative and discriminative models, but very few studies have assessed their annotation performance over different kinds of image datasets, which is necessary in order to fully understand the value of these two kinds of learning models. On the other hand, combinations of generative and discriminative learning techniques [145], or hybrid models, could also be considered for the image annotation task.
For the experimental setup, the target of most studies was not image retrieval. In other words, the performance evaluation was usually for small scale problems based on datasets containing a small number of categories, say 10. However, image retrieval users will not be satisfied with a system providing only 10 keyword-based queries to search for relevant images. Some benchmarks are much more suitable for larger scale image annotation, such as the Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) by ImageNet (http://www.image-net.org/challenges/LSVRC/2012/index) and Photo Annotation and Retrieval 2012 by ImageCLEF (http://www.imageclef.org/2012/photo). In particular, the ImageNet dataset contains over 10,000 categories and 10,000,000 labeled images, and ImageCLEF uses a subset of the MIRFLICKR collection (http://press.liacs.nl/mirflickr/), which contains 25,000 images and 94 concepts.
However, it is also possible to combine some smaller scale datasets, each composed of a relatively small number of images and/or categories, into larger ones. For example, the combination of Caltech 256 and Corel could be regarded as a benchmark closer to the real-world problem.
5. Conclusion
In this paper, a number of recent works using BoW for image annotation are reviewed. We can observe that this topic has been extensively studied in recent years. For example, many techniques have been proposed to improve the discriminative power of the BoW feature representation, involving image segmentation, vector quantization, and visual vocabulary construction. In addition, there are other directions that integrate the BoW feature into different applications, such as face detection, medical image analysis, 3D image retrieval, and so forth.
From the comparisons of related work, we can identify the most widely used methodology for extracting the BoW feature, which can be regarded as a baseline for future research. That is, DoG is used as the keypoint detector, and each keypoint is represented by the SIFT feature. The vector quantization step is based on the k-means clustering algorithm with 1000 visual words. However, the number of visual words (i.e., the k value) is dependent on the dataset used. Finally, the weighting scheme can be either TF or TF-IDF.
On the other hand, for the dataset issue in the experimental design, which can affect the contribution and final conclusion, the PASCAL, Caltech, and/or Corel datasets can be used as the initial study.
According to the comparative results, there are some future research directions. First, the point-based SIFT descriptor usually used for vector quantization can be compared with other descriptors, such as region-based features or combinations of different features. Second, guidelines should be provided for determining the number of visual words for different kinds of datasets. The third issue is to assess the performance of generative and discriminative learning models over different kinds of datasets, for example, different dataset sizes and different image contents (a single object per image versus multiple objects per image). Finally, it is worth examining the scalability of the BoW feature representation for large scale image annotation.

References

From: http://www.hindawi.com/journals/isrn/2012/376804/


展开全文
• 《Few-Shot Representation Learning for Out-Of-Vocabulary Words》 这篇文章是发表在2019年NAACL上的，主要是针对out of vocabulary问题提出的想法。 这里感觉和bert，elmo的想法类似，提炼的都是词向量，只不过这...
《Few-Shot Representation Learning for Out-Of-Vocabulary Words》
这篇文章是发表在2019年NAACL上的，主要是针对out of vocabulary问题提出的想法。
这里感觉和bert，elmo的想法类似，提炼的都是词向量，只不过这篇文章是将提炼词向量去解决out of Vocabulary这一个小问题上，切入点比较好；而且这篇文章自己提出了一个模型去进行词向量的学习。而之前的语言模型是，从大规模语料当中学习到语言学的一些通用知识，从而去进行下游任务。
这篇文章提炼词向量的方法是通过层级context encoder+Model Agnostic Meta-Learning (MAML)学习算法。在Rare-NER和词性标注的下游任务中取得了显著的改善。
分以下四部分介绍：
MotivationModelExperimentDiscussion
1、Motivation
在真实世界当中，out of vocabulary, 不会频繁的出现在训练语料里，对这部分的表示进行学习是一个challenge。论文提出了一种层级注意力结构对有限的observations进行词向量表示。 用一个词的上下文信息去进行编码，并且只使用K个observations去训练模型（目的就是希望模型能够准确对出现频率较少的词进行表示）。
为了使模型对新的语料有着更好的鲁棒性，提出了一种新的训练方法ModelAgnostic Meta-Learning (MAML)
2、Model
2.1 The Few-Shot Regression Framework
Problem formulation
首先在训练集上，我们使用训练方法产生词向量。这些词向量作为我们训练的一个目标标签，Oracle embedding。训练方法（MAML）如下：首先从大规模语料当中选出n个词，对于每一个词，我们可以用St表式所有包含这个词的句子。为了训练我们的模型，解决out of vocabulary的问题，我们随机的采样所有词的k(2, 4, 6)个句子，形成一个episode，该episode为一个小样本。在这个语料当中去训练我们的模型，然后用新的测试语料Dn微调。如此反复进行训练。同时加入字符特征。最后我们选择余弦距离作为我们的评价指标，目标是想让模型生成的词向量和oracle embedding 尽可能的接近。
2.2 Hierarchical Context Encoding (HiCE)
模型结构如下：  该模型也是训练一个语言模型， 输入：包含

w

t

w_t

的

k

k

个句子

s

t

,

k

s_{t, k}

，其中

w

t

w_t

w

t

w_t

的词向量表示
模型结构主要分为两部分，第1部分是context encoder，第2部分是，Multi context Aggregator。
第1部分主要是对输入去进行一个transformer encoder的编码，得到每一个句子的表示，第2部分将这每一个句子的表示进行连接，再经过一个transformer encoder，并和字符特征去进行concatenation最后输出层得到词向量。模型结构比较简单，通过此模型结构能够去补获上下文的信息以及整体的全局信息。
都是标准的transformer，模型参数和计算过程就不赘述了。
3、Experiment
Present two types of experiments to evaluate the effectiveness of the proposed HiCE model.
intrinsic evaluation–WikiText-103 (Merity et al., 2017)[1] WikiText-103 which used as Dt contains 103 million words extracted from a selected set of articlesextrinsic evaluation
3.1 Intrinsic Evaluation: Evaluate OOV Embeddings on the Chimera Benchmark
Evaluate HiCE on Chimera (Lazaridou et al., 2017)[2], a widely used benchmark dataset for evaluating word embedding for OOV words，对于每一个OOV的单词，只有几个句子会出现，用Spearman correlation去评估结果的好坏。结果如下：  1、We can see that adapting with MAML can improve the performance when the number of context sentences is relatively large (i.e., 4 and 6 shot), as it can mitigate the semantic gap between source corpus

D

T

D_T

and target corpus DN
3.2 Extrinsic Evaluation: Evaluate OOV Embeddings on Downstream Tasks
Named Entity Recognition
Rare-NER: focus on unusual, previouslyunseen entities in the context of emerging discussionsBio-NER: focuses on technical terms in the biology domain
2、 The experiment demonstrates that HiCE trained on DT is already able to leverage the general language knowledge which can be transferred through different domains, and adaptation with MAML can further reduce the domain gap and enhance the performance
模型结构还是挺强的，上面说道HICE已经可以学习到跨领域的通用知识，并且通过MAML能够更好地减少领域鸿沟
4、Discussion
1、首先文章的切入点比较好，针对NLP领域的一个小问题，即OOV问题提出自己的解决方法。
2、从大规模语料库中提取词向量，并且使用一种新的结构作为语言模型提炼语言学中的一些通用知识。
3、为了减少领域之间的gap问题，使用MAML的学习方法，加强模型的鲁棒性。
4、实验中应该加入HICE+MAML的对比试验，因为MAML的微调既会对Morph产生影响，又会对结构产生影响。
5、如果直接用bert在这几个任务上实验，效果如何。
Reference
[1]Stephen Merity, Caiming Xiong, James Bradbury, and Richard Socher. 2017. Pointer sentinel mixture models. In ICLR’17 [2]Angeliki Lazaridou, Marco Marelli, and Marco Baroni.2017. Multimodal word meaning induction from minimal exposure to natural text. Cognitive Science.
展开全文
• ## R语言笔记一

万次阅读 多人点赞 2016-06-19 21:44:10
BUT: The one exception is a list, which is represented as a vector but can contain objects of different classes (indeed, that’s usually why we use them) Empty vectors can be created with the ...
常用函数
object.size() ##查询数据大小  names() ##查询数据变量名称  head(x, 10) ,tail(x, 10) ##查询数据前/后10行  summary() ##对数据集的详细统计呈现  table(x$y) ##对y值出现次数统计 str() ##查询数据集/函数的详细结构 nrow(),ncol() ##查询行列数 sqrt(x) ##square root取x的平方根 abs(x) ##absolute value取x的绝对值 names(vect2)<-c(“foo”,”bar”,”norf”) ##给向量命名 identical(vect,vect2) ##TRUE 检查两个向量是否一样 vect[c(“foo”,”bar”)] ##用名字选取向量 colnames(my_data)<-cnames ##修改数据框的列名 t() ##互换数据框的行列 length(“”)统计字符数，空字符时计数为1 nchar(“”)统计字符数，空字符时计数为0 tolower()将字符转换为小写 toupper()将字符转换为大写 chartr(“A”,”B”,x):字符串x中使用B替换A na.omit()，移除所有含有缺失值的观测（行删除，listwise deletion） paste() paste("Var",1:5,sep="") [1] "Var1" "Var2" "Var3" "Var4" "Var5" > x<-list(a='aaa',b='bbb',c="ccc") > y<-list(d="163.com",e="qq.com") > paste(x,y,sep="@") [1] "aaa@163.com" "bbb@qq.com" "ccc@163.com" #增加collapse参数，设置分隔符 > paste(x,y,sep="@",collapse=';') [1] "aaa@163.com;bbb@qq.com;ccc@163.com" > paste(x,collapse=';') [1] "aaa;bbb;ccc"  strsplit()字符串拆分 strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE) x为需要拆分的字串向量 split为拆分位置的字串向量，默认为正则表达式匹配（fixed=FALSE）， 设置fixed=TRUE，表示使用普通文本匹配或正则表达式的精确匹配。普通文本的运算速度快 perl=TRUE/FALSE的设置和perl语言版本有关，如果正则表达式很长，正确设置表达式并且使用perl=TRUE可以提高运算速度。 useBytes设置是否逐个字节进行匹配，默认为FALSE，即按字符而不是字节进行匹配。 strsplit得到的结果是列表，后面要怎么处理就得看情况而定了  字符串替换：sub(),gsub() 严格地说R语言没有字符串替换的函数 R语言对参数都是传值不传址 sub和gsub的区别是前者只做一次替换，gsub把满足条件的匹配都做替换 > text<-c("Hello, Adam","Hi,Adam!","How are you,Ava") > sub(pattern="Adam",replacement="word",text) [1] "Hello, word" "Hi,word!" "How are you,Ava" > sub(pattern="Adam|Ava",replacement="word",text) [1] "Hello, word" "Hi,word!" "How are you,word" > gsub(pattern="Adam|Ava",replacement="word",text) [1] "Hello, word" "Hi,word!" 
"How are you,word"  字符串提取substr(), substring() substr和substring函数通过位置进行字符串拆分或提取，它们本身并不使用正则表达式 结合正则表达式函数regexpr、gregexpr或regexec使用可以非常方便地从大量文本中提取所需信息 语法格式 substr(x, start, stop) substring(text, first, last = 1000000L) 第 1个参数均为要拆分的字串向量，第2个参数为截取的起始位置向量，第3个参数为截取字串的终止位置向量 substr返回的字串个数等于第一个参数的长度 substring返回字串个数等于三个参数中最长向量长度，短向量循环使用 > x <- "123456789" > substr(x, c(2,4), c(4,5,8)) [1] "234" > substring(x, c(2,4), c(4,5,8)) [1] "234" "45" "2345678" 因为x的向量长度为1，substr获得的结果只有1个字串， 即第2和第3个参数向量只用了第一个组合：起始位置2，终止位置4。 substring的语句三个参数中最长的向量为c(4,5,8)，执行时按短向量循环使用的规则第一个参数事实上就是c(x,x,x)， 第二个参数就成了c(2,4,2)，最终截取的字串起始位置组合为：2-4, 4-5和2-8。  Workspace and Files ls() ##查询工作区对象 list.files(), dir() ##列出工作目录所有文件 dir.create(“testdir”) ##创建testdir目录 file.create(“mytest.R”) ##创建mytest.R文件 file.exists(“mytest.R”) ##查询文件是否存在 file.info(“mytest.R”) ， file.info(“mytest.R”)$mode ##查询文件包含信息，或特定信息  file.rename(“mytest.R”,”mytest2.R”) ##重命名为mytest2.R  file.remove(“mytest.R”) ##删文件  file.copy(“mytest2.R”,”mytest3.R”) ##复制为mytest3.R文件  file.path(“mytest3.R”) ##在众多工作文件中，指定提供某个文件的相对路径。  file.path(“folder1”,”folder2”) ##”folder1/folder2”也能创建独立于系统的路径供R工作。？
Create a directory in the current working directory called “testdir2” and a subdirectory for it called “testdir3”, all in one command by using dir.create() and file.path().
 dir.create(file.path('testdir2','testdir3'),recursive = TRUE)

setwd('testdir')     ##设testdir目录，为工作目录
> old.dir <- getwd()
args()  ##查询函数参数构成
sample(x) ##也可以对x重新排序
> sample(1:6, 4, replace = TRUE)
[1] 4 5 1 3

>flips <- sample(c(0,1),100,replace = TRUE, prob = c(0.3,0.7)) #prob设定0和1出现的概率

> flips
[1] 1 1 1 1 1 1 1 0 1 1 1 0 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1
[47] 1 0 1 1 1 1 1 0 1 0 1 1 1 1 0 0 1 1 1 1 0 1 1 1 1 0 0 0 0 0 1 0 0 1 1 0 1 1 1 1 1 0 0 1 1 1
[93] 1 1 1 1 1 0 1 1

Sequence of Numbers
> 1:10
[1]  1  2  3  4  5  6  7  8  9 10

>pi:10   ##real numbers 实数
[1] 3.141593 4.141593 5.141593 6.141593 7.141593 8.141593 9.141593

？‘：’查询操作符号：
> seq(1,10)
[1]  1  2  3  4  5  6  7  8  9 10

> seq(0, 10, by=0.5)
[1]  0.0  0.5  1.0  1.5  2.0  2.5  3.0  3.5  4.0  4.5  5.0  5.5  6.0  6.5  7.0  7.5  8.0  8.5
[19]  9.0  9.5 10.0

>my_seq<- seq(5,10,length=30)  ##在区间（5, 10）等距生成30个数
> 1:length(my_seq)
[1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
> seq(along.with = my_seq)
[1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

> seq_along(my_seq)  **
[1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

>rep(c(0,1,2),times=10)
[1] 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2 0 1 2
>rep(c(0,1,2),each=10)
[1] 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2

Vector
> paste(1:3,c("X", "Y", "Z"),sep="")
[1] "1X" "2Y" "3Z"

* Vector recycling!*
> paste(LETTERS, 1:4, sep = "-")
[1] "A-1" "B-2" "C-3" "D-4" "E-1" "F-2" "G-3" "H-4" "I-1" "J-2" "K-3" "L-4" "M-1" "N-2" "O-3"
[16] "P-4" "Q-1" "R-2" "S-3" "T-4" "U-1" "V-2" "W-3" "X-4" "Y-1" "Z-2"

数据类型
对象与属性 Objects and Attributes
Objects
R has five basic or “atomic” classes of objects:
characternumeric (real numbers)integercomplexlogical (True/False)
The most basic object is a vector
A vector can only contain objects of the same classBUT: The one exception is a list, which is represented as a vector but can contain objects of different classes (indeed, that’s usually why we use them)
Empty vectors can be created with the vector() function.
Numbers
Numbers in R a generally treated as numeric objects (i.e. double precision real numbers)If you explicitly want an integer, you need to specify the L suffixEx: Entering *1* gives you a numeric object; entering *1L* explicitly gives you an integer **There is also a special number *Inf* which represents infinity; e.g. 1 / 0; Inf can be used in ordinary calculations; e.g. 1 / Inf is 0The value *NaN* represents an undefined value (“not a number”); e.g. 0 / 0; *NaN* can also be thought of as a missing value (more on that later)
Attributes
R objects can have attributes
names, dimnamesdimensions (e.g. matrices, arrays)classlengthother user-defined attributes/metadata  Attributes of an object can be accessed using the attributes() function
向量与列表 Vectors and Lists
Creating Vectors
The c() function can be used to create vectors of objects.
> x <- c(0.5, 0.6) ## numeric
> x <- c(TRUE, FALSE) ## logical
> x <- c(T, F) ## logical
> x <- c("a", "b", "c") ## character
> x <- 9:29 ## integer
> x <- c(1+0i, 2+4i) ## complex

Using the vector() function
> x <- vector("numeric", length = 10)
> x
[1] 0 0 0 0 0 0 0 0 0 0

Mixing Objects
When different objects are mixed in a vector, coercion occurs so that every element in the vector is of the same class.
> y <- c(1.7, "a") ## character
> y <- c(TRUE, 2) ## numeric
> y <- c("a", TRUE) ## character

Explicit Coercion 强制明确
Objects can be explicitly coerced from one class to another using the as.* functions, if available.
> x <- 0:6
> class(x)
[1] "integer"
> as.numeric(x)
[1] 0 1 2 3 4 5 6
> as.logical(x)
[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE
> as.character(x)
[1] "0" "1" "2" "3" "4" "5" "6"

Nonsensical coercion results in NAs
> x <- c("a", "b", "c")
> as.numeric(x)
[1] NA NA NA
Warning message:
NAs introduced by coercion
> as.logical(x)
[1] NA NA NA
> as.complex(x)
[1] NA NA NA
Warning message:
NAs introduced by coercion

Lists
Lists are a special type of vector that can contain elements of different classes. Lists are a very important data type in R and you should get to know them well.
> x <- list(1, "a", TRUE, 1 + 4i)
> x
[[1]]
[1] 1
[[2]]
[1] "a"
[[3]]
[1] TRUE
[[4]]
[1] 1+4i

矩阵 Matrices
Matrices
Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length 2 (nrow, ncol)
> m <- matrix(nrow = 2, ncol = 3)
> m
[,1] [,2] [,3]
[1,] NA NA NA
[2,] NA NA NA
> dim(m)
[1] 2 3
> attributes(m) **
$dim [1] 2 3  Matrices (cont’d) Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and running down the columns. > m <- matrix(1:6, nrow = 2, ncol = 3) > m [,1] [,2] [,3] [1,] 1 3 5 [2,] 2 4 6  Matrices can also be created directly from vectors by adding a dimension attribute.** > m <- 1:10 > m [1] 1 2 3 4 5 6 7 8 9 10 > dim(m) <- c(2, 5) ** > m [,1] [,2] [,3] [,4] [,5] [1,] 1 3 5 7 9 [2,] 2 4 6 8 10  cbind-ing and rbind-ing Matrices can be created by column-binding or row-binding with cbind() and rbind(). > x <- 1:3 > y <- 10:12 > cbind(x, y) x y [1,] 1 10 [2,] 2 11 [3,] 3 12 > rbind(x, y) [,1] [,2] [,3] x 1 2 3 y 10 11 12  因子 Factors Factors are used to represent categorical data. Factors can be unordered or ordered. One can think of a factor as an integer vector where each integer has a label. Factors are treated specially by modelling functions like *lm()* and *glm()*Using factors with labels is *better* than using integers because factors are self-describing; having a variable that has values “Male” and “Female” is better than a variable that has values 1 and 2.  x <- factor(c("yes", "yes", "no", "yes", "no")) x [1] yes yes no yes no Levels: no yes table(x) x no yes 2 3 unclass(x) [1] 2 2 1 2 1 attr(,"levels") [1] "no" "yes"  The order of the levels can be set using the levels argument to factor(). This can be important in linear modelling because the first level is used as the baseline level. > x <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no")) ** > x [1] yes yes no yes no Levels: yes no  缺失值 Missing Values Missing values are denoted by NA or NaN for undefined mathematical operations. 
is.na() is used to test objects if they are NA; is.nan() is used to test for NaN. NA values have a class also, so there are integer NA, character NA, etc. A NaN value is also NA, but the converse is not true.
> x <- c(1, 2, NA, 10, 3)
> is.na(x)
[1] FALSE FALSE  TRUE FALSE FALSE
> is.nan(x)
[1] FALSE FALSE FALSE FALSE FALSE
> x <- c(1, 2, NaN, NA, 4)
> is.na(x)
[1] FALSE FALSE  TRUE  TRUE FALSE
> is.nan(x)
[1] FALSE FALSE  TRUE FALSE FALSE

Data Frames
Data frames are used to store tabular data. They are represented as a special type of list where every element of the list has to have the same length. Each element of the list can be thought of as a column, and the length of each element of the list is the number of rows. Unlike matrices, data frames can store different classes of objects in each column (just like lists); matrices must have every element be the same class. Data frames also have a special attribute called *row.names*. Data frames are usually created by calling *read.table()* or *read.csv()*, and can be converted to a matrix by calling *data.matrix()*.
> x <- data.frame(foo = 1:4, bar = c(T, T, F, F))
> x
  foo   bar
1   1  TRUE
2   2  TRUE
3   3 FALSE
4   4 FALSE
> nrow(x)
[1] 4
> ncol(x)
[1] 2

Names Attribute
R objects can also have names, which is very useful for writing readable code and self-describing objects.
> x <- 1:3
> names(x)
NULL
> names(x) <- c("foo", "bar", "norf")
> x
 foo  bar norf
   1    2    3
> names(x)
[1] "foo"  "bar"  "norf"

Lists can also have names.
> x <- list(a = 1, b = 2, c = 3)
> x$a
[1] 1
$b
[1] 2

$c
[1] 3

And matrices.
> m <- matrix(1:4, nrow = 2, ncol = 2)
> dimnames(m) <- list(c("a", "b"), c("c", "d"))
> m
c d
a 1 3
b 2 4

Summary
Data Types
- atomic classes: numeric, logical, character, integer, complex
- vectors, lists
- factors
- missing values
- data frames
- names
There are a few principal functions for reading data into R.
- *read.table()*, *read.csv()*, for reading tabular data
- *readLines()*, for reading lines of a text file
- *source()*, for reading in R code files (inverse of dump)
- *dget()*, for reading in R code files (inverse of dput)
- *load()*, for reading in saved workspaces
- *unserialize()*, for reading single R objects in binary form
Writing Data
There are analogous functions for writing data to files.
- write.table()
- writeLines()
- dump()
- dput()
- save()
- serialize()
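As a minimal sketch of one read/write pair, here is a round trip with write.table() and read.table(); the data frame and the temporary file are made up for illustration.

```r
## Sketch: write a data frame out with write.table() and read it back.
df <- data.frame(a = 1:3, b = c("x", "y", "z"))
tf <- tempfile()                # a throwaway file path
write.table(df, file = tf)      # writes a header and row names by default
df2 <- read.table(tf)           # reads them back
df2$a                           # the 'a' column survives the round trip
```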
The read.table function is one of the most commonly used functions for reading data. It has a few important arguments:
- *file*, the name of a file, or a connection
- *header*, logical indicating if the file has a header line
- *sep*, a string indicating how the columns are separated
- *colClasses*, a character vector indicating the class of each column in the dataset
- *nrows*, the number of rows in the dataset
- *comment.char*, a character string indicating the comment character
- *skip*, the number of lines to skip from the beginning
- *stringsAsFactors*, should character variables be coded as factors?
read.table
For small to moderately sized datasets, you can usually call read.table without specifying any other arguments.
data <- read.table("foo.txt")
R will automatically
- skip lines that begin with a #
- figure out how many rows there are (and how much memory needs to be allocated)
- figure out what type of variable is in each column of the table
Telling R all these things directly makes R run faster and more efficiently. *read.csv* is identical to *read.table* except that the default separator is a comma.
With much larger datasets, doing the following things will make your life easier and will prevent R from choking.
- Read the help page for read.table, which contains many hints.
- Make a rough calculation of the memory required to store your dataset. If the dataset is larger than the amount of RAM on your computer, you can probably stop right here.
- Set comment.char = "" if there are no commented lines in your file.
- Use the *colClasses* argument. Specifying this option instead of using the default can make read.table run MUCH faster, often twice as fast. In order to use this option, you have to know the class of each column in your data frame. If all of the columns are "numeric", for example, then you can just set *colClasses = "numeric"*. A quick and dirty way to figure out the classes of each column is the following:
initial <- read.table("datatable.txt", nrows = 100)
classes <- sapply(initial, class)
tabAll <- read.table("datatable.txt", colClasses = classes)
Set *nrows*. This doesn’t make R run faster but it helps with memory usage. A mild overestimate is okay. You can use the Unix tool *wc* to calculate the number of lines in a file.
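The two tips can be combined as in the sketch below; a small generated file stands in for a large dataset so the code runs anywhere (with a real file you would use your own path and a line count from wc -l).

```r
## Sketch: peek at the file, learn the column classes, then read it fully
## with colClasses and nrows set. The file here is a generated stand-in.
tf <- tempfile()
write.table(data.frame(id = 1:50, val = rnorm(50)), file = tf)
initial <- read.table(tf, nrows = 10)         # peek at the first rows
classes <- sapply(initial, class)             # class of each column
full <- read.table(tf, colClasses = classes,  # skip type detection
                   nrows = 50)                # a mild overestimate is fine
```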
Know Thy System
In general, when using R with larger datasets, it’s useful to know a few things about your system.
- How much memory is available?
- What other applications are in use?
- Are there other users logged into the same system?
- What operating system?
- Is the OS 32 or 64 bit?
Calculating Memory Requirements
I have a data frame with 1,500,000 rows and 120 columns, all of which are numeric data. Roughly, how much memory is required to store this data frame?

1,500,000 × 120 × 8 bytes/numeric
= 1,440,000,000 bytes
= 1,440,000,000 / 2^20 bytes/MB
= 1,373.29 MB
= 1.34 GB
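The same arithmetic can be done in R, and for an object you can actually build, object.size() reports the measured footprint (the full 1,500,000 × 120 frame is not created here).

```r
## Back-of-the-envelope memory estimate for a numeric data frame
rows <- 1500000; cols <- 120
bytes <- rows * cols * 8   # 8 bytes per numeric (double precision)
bytes / 2^20               # megabytes: about 1373.29
bytes / 2^30               # gigabytes: about 1.34

## On a small object, object.size() shows the real usage (data plus overhead)
small <- data.frame(matrix(0, nrow = 1000, ncol = 12))
object.size(small)         # somewhat more than 1000 * 12 * 8 bytes
```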
Textual Formats
- *dumping* and *dputing* are useful because the resulting textual format is edit-able, and in the case of corruption, potentially recoverable.
- *Unlike* writing out a table or csv file, *dump* and *dput* preserve the *metadata* (sacrificing some readability), so that another user doesn't have to specify it all over again.
- Textual formats can work much better with version control programs like subversion or git, which can only track changes meaningfully in text files.
- Textual formats can be longer-lived; if there is corruption somewhere in the file, it can be easier to fix the problem.
- Textual formats adhere to the "Unix philosophy."
- Downside: the format is not very space-efficient.
dput-ting R Objects
Another way to pass data around is by deparsing the R object with dput and reading it back in using dget.
> y <- data.frame(a = 1, b = "a")
> dput(y)
structure(list(a = 1,
b = structure(1L, .Label = "a",
class = "factor")),
.Names = c("a", "b"), row.names = c(NA, -1L),
class = "data.frame")
> dput(y, file = "y.R")
> new.y <- dget("y.R")
> new.y
a    b
1   1    a

Dumping R Objects
Multiple objects can be deparsed using the dump function and read back in using source.
> x <- "foo"
> y <- data.frame(a = 1, b = "a")
> dump(c("x", "y"), file = "data.R")
> rm(x, y)
> source("data.R")
> y
a  b
1  1  a
> x
[1] "foo"

Interfaces to the Outside World
Data are read in using connection interfaces. Connections can be made to files (most common) or to other more exotic things.
- *file*, opens a connection to a file
- *gzfile*, opens a connection to a file compressed with gzip
- *bzfile*, opens a connection to a file compressed with bzip2
- *url*, opens a connection to a webpage
File Connections
> str(file)
function (description = "", open = "", blocking = TRUE,
encoding = getOption("encoding"))

1. *description* is the name of the file
2. *open* is a code indicating
- "r" reading (read-only mode)
- "w" writing (and initializing a new file)
- "a" appending
- "rb", "wb", "ab" reading, writing, or appending in binary mode (Windows)

Connections
In general, connections are powerful tools that let you navigate files or other external objects. In practice, we often don’t need to deal with the connection interface directly.
con <- file("foo.txt", "r")
data <- read.csv(con)
close(con)
is the same as
data <- read.csv("foo.txt")
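One case where the connection interface is genuinely useful: a connection remembers its position, so a file can be consumed in chunks. A minimal sketch (the file here is generated on the fly):

```r
## Connections keep their read position between calls to readLines
tf <- tempfile()
writeLines(c("line 1", "line 2", "line 3", "line 4"), tf)
con <- file(tf, "r")
first  <- readLines(con, 2)   # "line 1" "line 2"
second <- readLines(con, 2)   # "line 3" "line 4" -- picks up where it left off
close(con)
```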
Reading Lines of a Text File
> con <- gzfile("words.gz")
> x <- readLines(con, 10)
> x
[1] "1080"        "10-point"   "10th"         "11-point"
[5] "12-point"  "16-point"   "18-point"  "1st"
[9] "2"              "20-point"

writeLines takes a character vector and writes each element one line at a time to a text file. readLines can be useful for reading in lines of webpages.
## This might take time
con <- url("http://www.jhsph.edu", "r")
x <- readLines(con)
head(x)
[1] "<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\">"
[2] ""
[3] "<html>"
[5] "\t<meta http-equiv=\"Content-Type\" content=\"text/html;charset=utf-8
Subsetting
There are a number of operators that can be used to extract subsets of R objects.
- [ always returns an object of the same class as the original; can be used to select more than one element (there is one exception)
- [[ is used to extract elements of a list or a data frame; it can only be used to extract a single element and the class of the returned object will not necessarily be a list or data frame
- $ is used to extract elements of a list or data frame by name; semantics are similar to that of [[.

> x <- c("a", "b", "c", "c", "d", "a")
> x[1]
[1] "a"
> x[2]
[1] "b"
> x[1:4]
[1] "a" "b" "c" "c"
> x[x > "a"]
[1] "b" "c" "c" "d"
> u <- x > "a"
> u
[1] FALSE  TRUE  TRUE  TRUE  TRUE FALSE
> x[u]
[1] "b" "c" "c" "d"

Subsetting Lists
> x <- list(foo = 1:4, bar = 0.6)
> x[1]
$foo
[1] 1 2 3 4
> x[[1]]
[1] 1 2 3 4
> x$bar
[1] 0.6
> x[["bar"]]
[1] 0.6
> x["bar"]
$bar
[1] 0.6

> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> x[c(1, 3)]
$foo
[1] 1 2 3 4

$baz
[1] "hello"

The [[ operator can be used with computed indices; $ can only be used with literal names.
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> name <- "foo"
> x[[name]]   ## computed index for 'foo'
[1] 1 2 3 4
> x$name      ## element 'name' doesn't exist!
NULL
> x$foo       ## element 'foo' does exist
[1] 1 2 3 4

Subsetting Nested Elements of a List
The [[ can take an integer sequence.
> x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))
> x[[c(1, 3)]]
[1] 14
> x[[1]][[3]]
[1] 14
> x[[c(2, 1)]]
[1] 3.14

Subsetting a Matrix
Matrices can be subsetted in the usual way with (i, j) type indices.
> x <- matrix(1:6, 2, 3)
> x[1, 2]
[1] 3
> x[2, 1]
[1] 2

Indices can also be missing.
> x[1, ]
[1] 1 3 5
> x[, 2]
[1] 3 4

By default, when a single element of a matrix is retrieved, it is returned as a vector of length 1 rather than a 1 × 1 matrix. This behavior can be turned off by setting drop = FALSE.
> x <- matrix(1:6, 2, 3)
> x[1, 2]
[1] 3
> x[1, 2, drop = FALSE]
     [,1]
[1,]    3

Similarly, subsetting a single column or a single row will give you a vector, not a matrix (by default).
> x <- matrix(1:6, 2, 3)
> x[1, ]
[1] 1 3 5
> x[1, , drop = FALSE]
     [,1] [,2] [,3]
[1,]    1    3    5

Partial Matching
Partial matching of names is allowed with [[ and $.
> x <- list(aardvark = 1:5)
> x\$a
[1] 1 2 3 4 5
> x[["a"]]
NULL
> x[["a", exact = FALSE]]
[1] 1 2 3 4 5

Removing NA Values
A common task is to remove missing values (NAs).
> x <- c(1, 2, NA, 4, NA, 5)
> bad <- is.na(x)
> x[!bad]
[1] 1 2 4 5

What if there are multiple things and you want to take the subset with no missing values?
> x <- c(1, 2, NA, 4, NA, 5)
> y <- c("a", "b", NA, "d", NA, "f")
> good <- complete.cases(x, y)
> good
[1] TRUE TRUE FALSE TRUE FALSE TRUE
> x[good]
[1] 1 2 4 5
> y[good]
[1] "a" "b" "d" "f"

> airquality[1:6, ]
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
6    28      NA 14.9   66     5   6
> good <- complete.cases(airquality)
> airquality[good, ][1:6, ]
  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
7    23     299  8.6   65     5   7

Vectorized Operations
Many operations in R are vectorized, making code more efficient, concise, and easier to read.
> x <- 1:4; y <- 6:9
> x + y
[1] 7 9 11 13
> x > 2
[1] FALSE FALSE TRUE TRUE
> x >= 2
[1] FALSE TRUE TRUE TRUE
> y == 8
[1] FALSE FALSE TRUE FALSE
> x * y
[1] 6 14 24 36
> x / y
[1] 0.1666667 0.2857143 0.3750000 0.4444444

Vectorized Matrix Operations
> x <- matrix(1:4, 2, 2); y <- matrix(rep(10, 4), 2, 2)
> x * y             ## element-wise multiplication
[,1]    [,2]
[1,]    10    30
[2,]    20    40
> x / y
[,1]    [,2]
[1,]    0.1    0.3
[2,]    0.2    0.4
> x %*% y       ## true matrix multiplication
[,1]    [,2]
[1,]      40    40
[2,]      60    60

Missing Values
Note that is.na(mydata) and mydata == NA do NOT give the same result: any comparison with NA yields NA, so always use is.na() to test for missing values.
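A short demonstration of the difference:

```r
## Comparison with NA propagates NA; is.na() actually tests for missingness
x <- c(1, NA, 3)
x == NA      # [1] NA NA NA
is.na(x)     # [1] FALSE  TRUE FALSE
```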
R uses 'one-based indexing', which (you guessed it!) means the first element of a vector is considered element 1.
x[c(2, 10)]    ## select the 2nd and 10th elements of x
x[c(-2, -10)]  ## select everything except the 2nd and 10th elements
x[-c(2, 10)]   ## same as above
Image Retrieval with Bag of Visual Words

You can use the Computer Vision System Toolbox™ functions to search by image, also known as a content-based image retrieval (CBIR) system. CBIR systems are used to retrieve images from a collection that are similar to a query image. Applications of these systems are found in many areas, such as web-based product search, surveillance, and visual place identification.

The retrieval system uses a bag of visual words, a collection of image descriptors, to represent your data set of images. Images are indexed to create a mapping of visual words. The index maps each visual word to its occurrences in the image set. A comparison between the query image and the index provides the images most similar to the query image. By using the CBIR system workflow, you can evaluate the accuracy for a known set of image search results.

Retrieval System Workflow
1. Create an image set that represents image features for retrieval. Use imageSet to store the image data. Use a large number of images that represent various viewpoints of the object. A large and diverse set of images helps train the bag of visual words and increases the accuracy of the image search.
2. Choose the type of feature. The indexImages function creates the bag of visual words using speeded up robust features (SURF). For other types of features, you can use a custom extractor, and then use bagOfFeatures to create the bag of visual words. See the Create Search Index Using Custom Bag of Features example. You can use the original imgSet or a different collection of images for the training set. To use a different collection, create the bag of visual words before creating the image index, using the bagOfFeatures function. The advantage of using the same set of images is that the visual vocabulary is tailored to the search set. The disadvantage of this approach is that the retrieval system must relearn the visual vocabulary to use on a drastically different set of images. With an independent set, the visual vocabulary is better able to handle the addition of new images into the search index.
3. Index the images. The indexImages function creates a search index that maps visual words to their occurrences in the image collection. When you create the bag of visual words using an independent or subset collection, include the bag as an input argument to indexImages. If you do not create an independent bag of visual words, then the function creates the bag based on the entire imgSet input collection. You can add and remove images directly to and from the image index using the addImages and removeImages methods.
4. Search the data set for similar images. Use the retrieveImages function to search the image set for images which are similar to the query image. Use the NumResults property to control the number of results, for example, to return the top 10 similar images. Set the ROI property to use a smaller region of a query image; a smaller region is useful for isolating a particular object in an image that you want to search for.

Evaluate Image Retrieval
Use the evaluateImageRetrieval function to evaluate image retrieval by using a query image with a known set of results. If the results are not what you expect, you can modify or augment image features by the bag of visual words. Examine the type of the features retrieved. The type of feature used for retrieval depends on the type of images within the collection. For example, if you are searching an image collection made up of scenes, such as beaches, cities, or highways, use a global image feature. A global image feature, such as a color histogram, captures the key elements of the entire scene. To find specific objects within the image collections, use local image features extracted around object keypoints instead.

Related Examples
Image Retrieval Using Customized Bag of Features
From: http://cn.mathworks.com/help/vision/ug/image-retrieval-with-bag-of-visual-words.html
...