  • redshift

    2016-12-21 12:04:33
    Redshift is based on PostgreSQL.
    Redshift (pgsql-based) vs. MySQL
    Same: both are relational databases
    Difference: column-oriented vs. row-oriented storage
    Name: data warehouse vs. database
    PHP needs the pgsql and pdo_pgsql extensions to connect to it and operate on it.

    Reference
    Why column-store databases read faster than traditional row-store databases

  • Data Warehousing | Amazon Redshift

    Understanding the foundations of Redshift

    For data engineers, and even analysts, it is important to understand the technology in order to use it fully and efficiently. In many cases, Redshift is seen as a traditional database like SQL Server, and its management is left to DBAs. I would argue that if Redshift best practices are followed, the role of a dedicated DBA diminishes to occasional management and upkeep.

    In this post, we’ll discover the architecture and understand the effect and impact each component has on queries.

    Simplistic View

    From 10,000 ft, Redshift appears like any other relational database with fairly standard SQL and entities like tables, views, stored procedures, and usual data types.


    We’ll start with Tables as these are containers for persistent data storage and will allow us to dive vertically into the architecture. This is what Redshift looks like from 10,000 ft:

    Figure: Simplistic 10,000 ft view

    Redshift is a clustered warehouse and each cluster can house multiple databases. As expected, each database contains multiple objects like tables, views, stored procedures, etc.

    Nodes and Slices

    Knowing that Redshift is a distributed and clustered service, it is logical to expect that the data tables are stored across multiple nodes.


    A node is a compute unit with dedicated CPUs, memory and disk. Redshift has two types of nodes: Leader and Compute. The Leader node manages data distribution and query execution across Compute nodes. Data is stored on Compute nodes only.

    Figure: Leader and Compute nodes

    To understand how Redshift distributes data, we need to know some details about compute nodes.

    A slice is a logical partition of a node's disk storage. Each node has multiple slices, which allow parallel access and processing across the slices on that node.

    The number of slices per node depends on the node instance type. Redshift currently offers 3 families of instances: Dense Compute (dc2), Dense Storage (ds2), and Managed Storage (ra3). The number of slices ranges from 2 to 16 per node, depending on the instance family and instance type. The objective is to distribute the workload of queries evenly across all nodes, to leverage parallel compute and increase efficiency. As such, the default behaviour is to distribute the data evenly across all slices on all nodes when it is loaded into a table, as shown below.

    Figure: Nodes and Slices with table distribution

    Each slice stores multiple tables in 1MB blocks. This system of slices and nodes achieves two objectives:

    1. Distribute data and compute evenly across all compute nodes.

    2. Colocate data and compute, minimizing data transfer and increasing join efficiency across nodes.
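The two objectives above can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the 2 nodes with 2 slices each, the `orders` rows, and the CRC32 hash are all made up for illustration, and Redshift's internal placement logic is not shown in the article. Round-robin placement stands in for the default even distribution, and hashing on a key stands in for colocating rows that will be joined on that key.

```python
import zlib


def even_distribution(rows, num_slices):
    """Place rows on slices in round-robin order (toy model of even distribution)."""
    slices = [[] for _ in range(num_slices)]
    for i, row in enumerate(rows):
        slices[i % num_slices].append(row)
    return slices


def key_distribution(rows, key, num_slices):
    """Place rows by a hash of one column, so equal keys land on the same slice.

    CRC32 is an assumption chosen because it is deterministic; it is not
    claimed to be the hash Redshift uses.
    """
    slices = [[] for _ in range(num_slices)]
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        slices[digest % num_slices].append(row)
    return slices


# Hypothetical cluster: 2 compute nodes with 2 slices each.
NODES, SLICES_PER_NODE = 2, 2
num_slices = NODES * SLICES_PER_NODE

# Hypothetical table: 20 orders belonging to 5 customers.
orders = [{"order_id": i, "customer_id": i % 5} for i in range(20)]

even = even_distribution(orders, num_slices)
by_customer = key_distribution(orders, "customer_id", num_slices)

print([len(s) for s in even])         # rows per slice, even placement
print([len(s) for s in by_customer])  # rows per slice, key placement
```

Even placement balances the slices exactly, while key placement trades some balance for the guarantee that all rows of one customer sit on one slice, which is what lets a join on that key run without moving rows between nodes.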

    Columnar Storage

    One key feature of Redshift that influences the compute is the columnar storage of data. In addition to the architecture and design for query efficiency, the data itself is stored in a columnar format. The majority of analytical queries will utilise a small number of columns from a table for any aggregations. Without going into details, data is stored by columns rather than rows. This presents multiple advantages for Redshift.


    Disk I/O is reduced significantly as only the necessary data are accessed. This means query performance is inversely correlated with the amount of data being accessed, and the number of columns in a table does not factor into disk I/O cost. A query selecting 5 columns out of a 100-column table only has to access roughly 5% of the data block space, assuming the columns are of similar size.
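The 5% figure can be checked with a back-of-the-envelope calculation. The sketch below is a simplification under stated assumptions: every value has the same size, a block holds a fixed number of values, and the row count and block capacity are invented numbers. It counts how many blocks a full scan has to read in a row-oriented layout versus a column-oriented one.

```python
def blocks_read_row_store(num_rows, num_columns, values_per_block):
    """A row store keeps whole rows together, so a scan touches every column."""
    total_values = num_rows * num_columns
    return -(-total_values // values_per_block)  # ceiling division


def blocks_read_column_store(num_rows, selected_columns, values_per_block):
    """A column store keeps each column in its own blocks; only selected ones are read."""
    blocks_per_column = -(-num_rows // values_per_block)
    return selected_columns * blocks_per_column


# Hypothetical table: 1,000,000 rows, 100 columns, 10,000 values per block.
rows, columns, selected, per_block = 1_000_000, 100, 5, 10_000

row_blocks = blocks_read_row_store(rows, columns, per_block)
col_blocks = blocks_read_column_store(rows, selected, per_block)
fraction = col_blocks / row_blocks

print(row_blocks, col_blocks, fraction)
```

With these numbers the column layout reads 500 blocks instead of 10,000, which is the 5% quoted above. Compression and differing column widths would change the exact ratio.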

    Each block of data contains values from a single column. This means the data type within each block is always the same. Redshift can therefore apply a specific and appropriate compression to each block, increasing the amount of data that can be processed within the same disk and memory space. Using a 1 MB block size increases this efficiency in comparison with other databases, which use blocks of several KB.
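To see why same-typed, single-column blocks compress well, here is a minimal run-length encoding sketch in Python. Run-length encoding is used only as a simple stand-in for column compression in general; the article does not say which encodings Redshift applies, and the `column_block` values are invented.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    out = []
    for value, length in pairs:
        out.extend([value] * length)
    return out


# Hypothetical block from a "country" column: few distinct values, long runs.
column_block = ["US"] * 4 + ["DE"] * 2 + ["US"] * 3

encoded = rle_encode(column_block)
print(encoded)
print(len(column_block), "values stored as", len(encoded), "pairs")
```

A row-oriented block would interleave country codes with order ids, amounts and timestamps, so runs like these would rarely form.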

    Overall, due to compression, the large block size and columnar storage, Redshift can process data in a highly efficient manner that scales with increasing data usage. Understanding this, a database developer can write optimal queries and avoid the select * that is common with OLTP databases.

    Workload Management

    So far, data storage and management have shown significant benefits. Now it is time to consider the management of queries and workloads on Redshift. Redshift is a data warehouse and is expected to be queried concurrently by multiple users as well as by automated processes. Workload Management (WLM) is a way to control the allocation of compute resources to groups of queries or users. Through WLM, it is possible to prioritise certain workloads and ensure the stability of processes.

    WLM allows defining “queues” with specific memory allocation, concurrency limits and timeouts. Each query is executed via one of the queues. When a query is submitted, Redshift allocates it to a specific queue based on the user or query group. There are some default queues that cannot be modified, such as those for superusers, vacuum maintenance and short queries (<20 sec). WLM queues are configurable; however, Amazon also provides a fully managed alternative called “Auto WLM”. In “Auto WLM” mode, everything is managed by the Redshift service, including concurrency and memory management.
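The queue assignment described above can be sketched as a small routing function. This is a simplified model, not Redshift's implementation: the `Queue` class, the queue names, and the concurrency, memory and timeout values are all hypothetical. It assumes queues are checked in order, that the first queue matching the user group or query group wins, and that the last queue acts as the default.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Hypothetical description of one manually configured WLM queue."""
    name: str
    user_groups: set = field(default_factory=set)
    query_groups: set = field(default_factory=set)
    concurrency: int = 5      # queries allowed to run at the same time
    memory_percent: int = 0   # share of memory reserved for this queue
    timeout_ms: int = 0       # 0 means no timeout


def route(queues, user_groups, query_group=None):
    """Return the first queue matching the user or query group, else the default."""
    for queue in queues[:-1]:
        if queue.user_groups & set(user_groups):
            return queue
        if query_group is not None and query_group in queue.query_groups:
            return queue
    return queues[-1]  # the last queue is treated as the default


queues = [
    Queue("etl", user_groups={"etl_users"}, concurrency=2, memory_percent=50),
    Queue("bi", query_groups={"dashboard"}, concurrency=10, memory_percent=30,
          timeout_ms=60_000),
    Queue("default", memory_percent=20),
]

print(route(queues, ["etl_users"]).name)
print(route(queues, ["analysts"], query_group="dashboard").name)
print(route(queues, ["analysts"]).name)
```

Giving the ETL queue low concurrency and a large memory share, and the dashboard queue high concurrency with a timeout, is one way to keep long loads from starving interactive queries.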

    Understanding the Redshift architecture is key to reaping its benefits. Redshift is usually misunderstood as yet another database engine because engineers/analysts lack this knowledge. The architecture can be leveraged to deliver very high throughput queries and huge data processing.

    Translated from: https://towardsdatascience.com/amazon-redshift-architecture-b674513eb996
  • Redshift Console aims to be a tool for monitoring and managing Redshift clusters. The first version has basic tools for monitoring running queries, WLM queues, and tables/schemas. After more than a year of managing our Redshift cluster with scripts and queries, we decided to bundle them into a more...
  • aws redshift

    This article gives you an overview of AWS Redshift and describes the method of creating a Redshift Cluster step-by-step.

    Introduction

    AWS Redshift is a columnar data warehouse service on AWS cloud that can scale to petabytes of storage, and the infrastructure for hosting this warehouse is fully managed by AWS. Redshift operates in a clustered model with a leader node and multiple worker nodes, like clustered or distributed database models in general. It is based on Postgres, so it shares a lot of similarities with Postgres, including the query language, which is nearly identical to Structured Query Language (SQL). Redshift supports creating almost all the major database objects, such as Databases, Tables, Views, and even Stored Procedures. In this article, we will explore how to create your first Redshift cluster on AWS and start operating it.

    Pre-requisites

    An AWS account with the required privileges is needed to use the AWS Redshift service. To create an AWS account, you need a credit card or another payment method supported by AWS.

    Once you have a new AWS account, AWS offers many services under the free tier, where you receive a certain usage limit of specific services for free. New account users get a 2-month free trial of Redshift, so if you are a new user, you will not be charged for Redshift usage for 2 months for a specific type of Redshift cluster.

    Creating your first AWS Redshift Cluster

    It is assumed that the reader has an AWS account and required administrative privileges to operate on Redshift. If you are a new user, it is highly probable that you would be the root/admin user and you would have all the required permissions to operate anything on AWS. Once you log on to AWS using your user credentials (user id and password), you would be shown the landing screen which is also called the AWS Console Home Page.


    In AWS cloud, almost every service, with a few exceptions, is a regional service, which means that whatever you create is created in the region you have selected. The default region in AWS is N. Virginia, which you can see in the top-right corner. If you wish to create your Redshift cluster in a different region, you can select the region of your choice. After selecting the region, the next step is to navigate to the AWS Redshift home page. Type Redshift in the search box as shown below, and you will find the service name listed.

    Searching Redshift from AWS Console

    Click on the service name and you would be navigated to the home page or the dashboard page of Redshift as shown below.


    Redshift Home Page

    Once you are on the home page of AWS Redshift, you will find several icons on the left pane, which offer options to operate on various features of Redshift. To get started, we need to create a cluster first, then log on to the cluster to create database objects in it. On the right-hand side of the screen, you will find a button named Create Cluster as shown above. Click this button to start specifying the configuration with which the cluster will be built.

    Cluster Configuration

    Once you are in the cluster creation wizard, you need to provide several details that determine the configuration of your AWS Redshift cluster. First, provide a cluster name of your choice. The next detail is Node Type, which determines the capacity of the nodes in your cluster. DC2 stands for Dense Compute, DS2 stands for Dense Storage, and RA3 is the latest and most advanced offering from Redshift, with the most powerful nodes and a very large compute and storage capacity. RA3 is shown by default as the recommended option. First-time users who are just getting started with Redshift often do not need such high-capacity nodes, which can incur a lot of cost. DC2 usage is covered in the free tier, and it offers a very reasonable configuration at an affordable cost for modest data volumes. So, select the dc2.large node type, which offers 160 GB of storage per node.

    The next step is to select the number of nodes in the cluster. We could create a single-node cluster, but that would technically not count as a cluster, so we will use a 2-node cluster. The default number of nodes is 2, which you can change as required. Below the number of nodes, the wizard shows that the cost of running this cluster for an entire month is $320. It is recommended to terminate the cluster when it is not in use. Creating or terminating a cluster takes only a few minutes. Depending on your use case, you can either pause or terminate a cluster when it is not required. First-time users are covered under the free tier, so they will not be charged for running a 2-node DC2 cluster for a couple of hours.

    Database Configuration

    Redshift Database Configuration

    The next step is to specify the database configuration. The default database name is dev, and the default port on which AWS Redshift listens is 5439. You can change this configuration as needed or use the default values. In this case, we will use the default values.
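For readers who later connect from their own code rather than the browser, the defaults above (database dev, port 5439) are what a client needs alongside the cluster endpoint. The sketch below only assembles a PostgreSQL-style connection URL with the Python standard library. The hostname and password are made up, `build_redshift_url` is a hypothetical helper, and using a PostgreSQL-compatible driver is an assumption based on Redshift being Postgres-based.

```python
from urllib.parse import quote


def build_redshift_url(host, user, password, dbname="dev", port=5439):
    """Assemble a postgresql:// URL using the wizard's default database and port.

    The user and password are percent-encoded so that characters such as
    '@' or spaces do not break the URL.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{dbname}"
    )


# Hypothetical endpoint and credentials, for illustration only.
url = build_redshift_url(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    user="awsuser",
    password="p@ss word",
)
print(url)
```

A URL of this form is what PostgreSQL client libraries generally accept; whether a given driver works against a particular cluster depends on the driver and the cluster's network settings.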

    Redshift Database Credentials

    After specifying the database name and port, the next required detail is the master username and password, which is the administrative credential that provides full access to the AWS Redshift cluster. The default username is awsuser. Provide a password of your choice that follows the rules listed below the password box. This completes the database-level configuration of Redshift.

    Additional Configurations

    Redshift Additional Configuration

    Cluster permissions is an optional configuration that allows specifying Identity and Access Management (IAM) roles that allow the AWS Redshift clusters to communicate/integrate with other AWS services. It can be modified even after the cluster is created, so we would not configure it for now.


    In the additional configurations section, switch off the Use Defaults switch, as we intend to change the accessibility of the cluster. We intend to use the cluster from our personal machine over an open internet connection. This is generally not the recommended configuration for production scenarios, but for first-time users who are just getting started with Redshift and do not have any sensitive data in the cluster, it is acceptable to use the cluster over the open internet for a very short duration. The additional configuration allows specifying details such as network configuration, security, backup management, parameter and option groups that control the behavior of the Redshift cluster, and maintenance windows.

    The only option we need to change here is the Publicly Accessible setting, as shown below. The default value for this setting is No. Change it to Yes, so that the necessary network changes are made to allow the AWS Redshift cluster to be used over the open internet through the cluster endpoint that will be created.

    Redshift Public Accessibility

    Once this configuration is complete, click on the Create Cluster button. This will start creating your cluster, and you will be taken to the clusters window, where the cluster is shown with a status of Modifying. Do not be alarmed by this: you might expect a creating, pending, or in-progress status for a cluster that is just being created, but Modifying is the term AWS uses while creating or modifying any type of cluster.

    Redshift Cluster Status

    Once the cluster is created you would find it in Available status as shown below.


    Redshift Available Cluster Status

    Once you click on the Dashboard, you will be able to see the statistics of the cluster, for example, 1 Cluster(s), 2 Total nodes, etc. Consider exploring this page to check out more details regarding your cluster.

    Redshift Dashboard Page

    Querying AWS Redshift Cluster

    Click on the Editor icon on the left pane to connect to Redshift and run queries to interrogate the database or create database objects. This page requires you to provide your master username and password to log on and start using the database from the browser itself, without the need for an external IDE. Provide the details as shown below and click on the Connect to database button.

    Redshift Editor Login

    Once you successfully log on, you would be navigated to a window as shown below. The data objects list the system objects and schemas. The Query editor window facilitates firing queries against the selected schema.


    Redshift Query Editor Window

    You can start running DDL (Data Definition Language) and DML (Data Manipulation Language) queries from the Query Editor window as shown below.
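To make the DDL/DML distinction concrete without a running cluster, the sketch below uses Python's built-in sqlite3 module as a local stand-in. This is an assumption made purely so the example runs anywhere: SQLite is not Redshift, the `sales` table and its rows are invented, and a real Redshift table definition would typically add warehouse-specific options that are not shown here. The statements themselves are plain SQL of the kind you could paste into the Query Editor.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a warehouse connection.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define a database object (a hypothetical sales table).
cur.execute(
    "CREATE TABLE sales (sale_id INTEGER, region VARCHAR(16), amount DECIMAL(10, 2))"
)

# DML: insert rows into it.
cur.executemany(
    "INSERT INTO sales (sale_id, region, amount) VALUES (?, ?, ?)",
    [(1, "east", 120.50), (2, "west", 80.00), (3, "east", 42.25)],
)

# Query: aggregate only the columns that are needed, rather than SELECT *.
cur.execute("SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region")
totals = cur.fetchall()
conn.close()

for region, total in totals:
    print(region, total)
```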

    Redshift SQL Queries

    Deleting AWS Redshift Cluster

    Once you are done using your cluster, it is recommended to terminate the cluster to avoid incurring any cost or wastage of the free-tier usage. Navigate to the dashboard page by clicking on the dashboard icon on the left pane. Select your cluster and click on the Delete button from the Actions menu.


    Deleting Redshift Cluster

    You will be prompted with a pop-up dialog asking whether to create a final snapshot. A snapshot incurs an additional cost, so if you do not have any data that you want to retain, you can uncheck this option as shown below. Click on the Delete button to start the deletion process; within a minute or two the AWS Redshift cluster will be deleted.

    Redshift Delete Confirmation

    Conclusion

    In this article, we covered the process of creating an AWS Redshift cluster and the various details that are required for creating it. We briefly looked at how to access the cluster from the browser and run SQL queries against it. Finally, we learned how to delete the cluster once it is no longer required, to stop incurring any usage cost.

    Translated from: https://www.sqlshack.com/getting-started-with-aws-redshift/
  • redshift pitfalls

    2020-12-19 14:30:08

    Reference: https://blog.csdn.net/u014025444/article/details/91488957

    $ sudo pacman -S redshift

    Install.

    $ redshift -l geoclue2

    Look up your latitude and longitude.
    Be sure to close the terminal once you have noted them down! As a beginner I got caught here: I did not realize that this command also keeps redshift running, and several running instances make the screen flicker.

    root

    mkdir /home/{your username}/.config/redshift
    vim /home/{your username}/.config/redshift/redshift.conf
    
    ; Global settings for redshift
    [redshift]
    ; Set the day and night screen temperatures
    temp-day=5500
    temp-night=3700
    
    ; Enable/Disable a smooth transition between day and night
    ; 0 will cause a direct change from day to night screen temperature.
    ; 1 will gradually increase or decrease the screen temperature.
    transition=1
    
    ; Set the screen brightness. Default is 1.0.
    ;brightness=0.9
    ; It is also possible to use different settings for day and night
    ; since version 1.8.
    ;brightness-day=0.7
    ;brightness-night=0.4
    ; Set the screen gamma (for all colors, or each color channel
    ; individually)
    gamma=1.0
    ;gamma=0.8:0.7:0.8
    ; This can also be set individually for day and night since
    ; version 1.10.
    ;gamma-day=0.8:0.7:0.8
    ;gamma-night=0.6
    
    ; Set the location-provider: 'geoclue2', 'manual'
    ; type 'redshift -l list' to see possible values.
    ; The location provider settings are in a different section.
    location-provider=manual
    
    ; Set the adjustment-method: 'randr', 'vidmode' 
    ; type 'redshift -m list' to see all possible values.
    ; 'randr' is the preferred method, 'vidmode' is an older API
    ; that works in some cases when 'randr' does not.
    ; The adjustment method settings are in a different section.
    adjustment-method=randr
    
    ; Configuration of the location-provider:
    ; type 'redshift -l PROVIDER:help' to see the settings.
    ; ex: 'redshift -l manual:help'
    ; Keep in mind that longitudes west of Greenwich (e.g. the Americas)
    ; are negative numbers.
    [manual]
    lat=23.14
    ; lat and lon are the two values looked up above (no trailing spaces after them; that tripped me up too)
    lon=113.25
    
    ; Configuration of the adjustment-method
    ; type 'redshift -m METHOD:help' to see the settings.
    ; ex: 'redshift -m randr:help'
    ; In this example, randr is configured to adjust screen 0.
    ; Note that the numbering starts from 0, so this is the
    ; first screen. If this option is not specified, Redshift will try
    ; to adjust _all_ screens.
    [randr]
    screen=0
    

    Start it and you are done, 2020/12/19.
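Since the trailing-space problem mentioned in the config is easy to miss by eye, a small check can catch it before starting redshift. The sketch below is an assumption-laden helper, not part of redshift itself: `check_config` is a hypothetical name, Python's configparser is used as a stand-in for redshift's own config reader, and the value ranges are only the obvious geographic limits.

```python
import configparser

SAMPLE = """\
[redshift]
temp-day=5500
temp-night=3700
location-provider=manual
adjustment-method=randr

[manual]
lat=23.14
lon=113.25
"""


def check_config(text):
    """Return a list of human-readable problems found in a redshift.conf text."""
    problems = []

    # Flag trailing whitespace on setting lines (comment lines start with ';').
    for raw in text.splitlines():
        if raw != raw.rstrip() and not raw.lstrip().startswith(";"):
            problems.append(f"trailing whitespace: {raw.strip()!r}")

    parser = configparser.ConfigParser()
    parser.read_string(text)

    lat = parser.getfloat("manual", "lat")
    lon = parser.getfloat("manual", "lon")
    if not -90 <= lat <= 90:
        problems.append(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        problems.append(f"longitude out of range: {lon}")
    return problems


print(check_config(SAMPLE))
print(check_config(SAMPLE.replace("lat=23.14", "lat=23.14 ")))
```

To check a real file, read it with `open(path).read()` and pass the text to `check_config`.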

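  • Introduction to Redshift

    1,000+ views 2019-08-13 20:38:07
    Redshift is a relational database management system (RDBMS) provided by AWS. It is based on PostgreSQL, but the two differ considerably. You can connect to a Redshift data warehouse with the built-in Redshift client or with the third-party SQL Workbench/J. Redshift architecture: Redshift...
  • RedShift (open source)

    2021-08-03 14:08:50
    A hybrid-system simulation language similar to SHIFT and Lambda-SHIFT, written in Ruby and C. It combines numerical integration, state machines, and the power of Ruby. Installation: get ruby, then "gem install redshift".
  • Redshift Database Manual

    2019-05-10 11:36:09
    This is the Amazon Redshift Database Developer Guide. Amazon Redshift is an enterprise-level, petabyte scale, fully managed data warehousing service. This guide focuses on using Amazon Redshift to ...
  • Amazon Redshift Utilities: Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse solution that uses columnar storage to minimize IO and to provide high data compression rates and fast performance. This GitHub repository provides a collection of scripts and utilities that help you ...
  • Redshift Data Source for Apache Spark. Note: To ensure the best experience for our customers, we have decided to inline this connector directly in Databricks Runtime. The latest version of Databricks Runtime (3.0+) includes an advanced version of the RedShift connector for Spark...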
  • redshift: Redshift is awesome, until it stops being ... Usually, it isn’t so much Redshift’s fault when that happens. One of the most common problems that people using Redshift face is of bad query pe...
  • Amazon Redshift basics: A short article on the basics of Amazon Redshift. Build: epub/html: run build.bat (uses pandoc); mobi: run epub-to-mobi.bat (uses kindlegen)
  • Redshift logs: Scripts for setting up log analysis with AWS Redshift
  • Add Redshift Adapter

    2021-01-08 12:32:19
    ✨ Adds support for Redshift ✨ Run tests with REDSHIFT_URL=redshift_url rake test:redshift. This question comes from the open-source project ankane/groupdate.
  • pandas_redshift: This package is designed to make it easier to get data from redshift into a pandas DataFrame and vice versa. The pandas_redshift package only supports python3. Installation: pip install pandas-redshift Example: import pandas_redshift as pr ...
  • Redshift database backend for Django: This is a database backend for Django. Documentation. Django settings: The ENGINE for the database is "django_redshift_backend". You can set the name in settings.py as: DATABASES = { 'default': { 'ENGINE': '...
  • Redshift::Rails: This library provides a railtie that brings redshift-client into Rails >= 4. Installation: add this line to your application's Gemfile: gem 'redshift-rails' Then execute: $ bundle Or install it yourself as: $ gem install redshift-...
  • Navigation, ORM, Upcoming features, License. Overview: This package is a simple wrapper for the common functionality needed when working with Redshift. ...var Redshift = require('node-redshift'); var client = { user: user, database: database, passwo
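  • redshift-copy-script: A small Python script for running COPY commands in Redshift
  • redshift setup

    2020-08-07 17:58:38
    After installation, there are two ways to start it. 1. In a terminal, enter redshift-gtk -l 39.90:116.40 -t 5000:3500. This command manually sets the latitude and longitude to '39.90:116.40' (Beijing) and sets the color temperature to 5000K by day and 3500K by night.
  • redshift udf: Python UDFs in Amazon Redshift. Translated from: https://www.pybloggers.com/2016/09/python-udfs-in-amazon-redshift/
  • redshift-fake-driver: redshift-fake-driver is a JDBC driver that accepts Redshift-specific commands (e.g. COPY, UNLOAD), which is useful for local testing. The driver uses the AWS SDK for Java to connect to S3, so you can use a mock library to mock S3...
  • amazon-redshift-s3-data-integration: Data integration between Amazon Redshift and Amazon S3
  • redshift_csv (source code)

    2021-05-06 14:51:38
    A gem that outputs redshift queries to csv. Installation: add this line to your application's Gemfile: gem 'redshift_csv' Then execute: $ bundle Or install it yourself as: $ gem install redshift_csv Usage: RedshiftCsv.config( ...
  • A simple yet powerful tool for migrating your data from Redshift to Redshift Spectrum. Free software: MIT license. Features: one-liners to export a Redshift table to S3 (CSV); convert the exported CSVs to Parquet files in parallel; on the Redshift cluster...
  • redshift-utils (source code)

    2021-07-01 22:46:22
    redshift-utils: data loading utilities for redshift. Input: CSV(s), s3 bucket name, table name, Redshift server, database, credentials. Steps: generate a create table statement (using csvsql): head -n 500000 filename | csvsql -i postgresql -e...
  • A zero-administration Amazon Redshift database loader based on AWS Lambda: With this AWS Lambda function, getting file data into Amazon Redshift has never been easier. You simply drop files into a pre-configured location on Amazon S3, and this function automatically loads them into your Amazon Redshift...
  • Redshift management scripts: Below are some scripts we have written; they help us manage the clusters we are responsible for. Redshift start/stop script: the bin/redshift.sh script ensures that the cluster's configuration and data are preserved across restarts. In fact, stopping a redshift cluster is more like a...
  • I'm trying to connect to Redshift database from SQL Workbench/J using Postgre JDBC drivers but I can't get through. I get this error "The connection attempt failed". The jdbc driver is properly locate...
  • Setting up a data warehouse with AWS Redshift and Ruby: Most startups eventually need a robust solution for storing large amounts of data for analytics. As is done at Credible, perhaps you are running a video app and want to understand user churn, or you are studying a website...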
