    Noise Suppression Using Deep Learning

    [Image credit: Zoom]

    At the time of the boom in online video conferencing and virtual communication, the ability of a platform to suppress background noise plays a crucial role in giving it a leading edge. Platforms like Google Meet constantly use Machine Learning to perform noise suppression and provide the best audio quality possible. Today I will show you how you can make your own Deep Learning model to perform Noise Suppression.

    What's the Big Deal with Noise Suppression?

    The task of Noise Suppression can be approached in a few different ways. These might include Generative Adversarial Networks (GANs), embedding-based models, residual networks, etc. Irrespective of the approach, there are two major problems with Noise Suppression:

    1. Handling variable length audio sequences

    2. Slow processing time resulting in lag

    In this article, I will show you some basic methods for dealing with these problems.

    Let's Start

    We will first import our libraries. We will be using TensorFlow; you are free to implement a PyTorch version of the same.

    import tensorflow as tf
    from tensorflow.keras.layers import Conv1D,Conv1DTranspose,Concatenate,Input
    import numpy as np
    import IPython.display
    import glob
    from tqdm.notebook import tqdm
    import librosa.display
    import matplotlib.pyplot as plt

    The data we will be using is a combination of clean and noisy audio samples of different sizes. The dataset is provided by the University of Edinburgh and can be downloaded from here.

    Loading and Visualising the Data

    We will use TensorFlow's tf.audio module to load our data. Using tf.audio.decode_wav() along with tf.io.read_file() has given me about 50% faster loading times compared to librosa.load(), because TensorFlow can use the GPU.
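
    As a rough, hedged way to reproduce that comparison on a single file (the path below is a hypothetical example; exact numbers will vary with hardware):

    import timeit
    import librosa

    wav_path = '/content/CleanData/p234_001.wav'  # hypothetical example file

    t_tf = timeit.timeit(
        lambda: tf.audio.decode_wav(tf.io.read_file(wav_path), desired_channels=1),
        number=100)
    t_librosa = timeit.timeit(lambda: librosa.load(wav_path, sr=None), number=100)
    print(f'tf.audio: {t_tf:.2f}s vs librosa: {t_librosa:.2f}s for 100 loads')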

    clean_sounds = glob.glob('/content/CleanData/*')
    noisy_sounds = glob.glob('/content/NoisyData/*')
    
    
    clean_sounds_list,_ = tf.audio.decode_wav(tf.io.read_file(clean_sounds[0]),desired_channels=1)
    for i in tqdm(clean_sounds[1:]):
      so,_ = tf.audio.decode_wav(tf.io.read_file(i),desired_channels=1)
      clean_sounds_list = tf.concat((clean_sounds_list,so),0)
    
    
    noisy_sounds_list,_ = tf.audio.decode_wav(tf.io.read_file(noisy_sounds[0]),desired_channels=1)
    for i in tqdm(noisy_sounds[1:]):
      so,_ = tf.audio.decode_wav(tf.io.read_file(i),desired_channels=1)
      noisy_sounds_list = tf.concat((noisy_sounds_list,so),0)
    
    
    clean_sounds_list.shape,noisy_sounds_list.shape

    Here we load our individual audio files using tf.audio.decode_wav() and concatenate them to get two tensors named clean_sounds_list and noisy_sounds_list. This process takes about 3–4 minutes to complete and is represented visually by the tqdm loading bar.

    batching_size = 12000
    
    
    clean_train,noisy_train = [],[]
    
    
    for i in tqdm(range(0,clean_sounds_list.shape[0]-batching_size,batching_size)):
      clean_train.append(clean_sounds_list[i:i+batching_size])
      noisy_train.append(noisy_sounds_list[i:i+batching_size])
    
    
    clean_train = tf.stack(clean_train)
    noisy_train = tf.stack(noisy_train)
    
    
    clean_train.shape,noisy_train.shape

    After the loading is done, we need to make uniform splits of the one big audio waveform. Although this is not compulsory, our main aim is to convert this model to a TFLite model, which currently does not support variable-length inputs. I set batching_size to an arbitrary value of 12000. You are free to change it, but keep it as small as possible; it will be useful later.

    [Image: Clean Audio]
    [Image: Noisy Audio]

    For the visualising part, we use librosa's display module, which essentially uses matplotlib in the backend to plot the data. Plotting the data as seen above, we can see that the noise is quite visible. The noise can be anything, ranging from people and cars to dish-washing sounds.
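
    A minimal sketch of how such plots can be produced for one 12000-sample split (the sample rate sr is an assumption about the dataset, and waveshow requires librosa >= 0.10; older releases use waveplot instead):

    sr = 48000  # assumed sample rate of the dataset
    fig, (ax_clean, ax_noisy) = plt.subplots(2, 1, figsize=(12, 6), sharex=True)
    librosa.display.waveshow(clean_train[0].numpy().flatten(), sr=sr, ax=ax_clean)
    ax_clean.set_title('Clean Audio')
    librosa.display.waveshow(noisy_train[0].numpy().flatten(), sr=sr, ax=ax_noisy)
    ax_noisy.set_title('Noisy Audio')
    plt.tight_layout()
    plt.show()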

    Creating a tf.data.Dataset for Pipelining

    We will now create a very basic helper function called get_dataset() to generate a tf.data.Dataset. We choose 40000 samples for training and the remaining 5000 for testing. Again, you are free to tweak and add to this, but I will not go into depth here, as pipeline optimization is not the main goal of this article.

    def get_dataset(x_train,y_train):
      dataset = tf.data.Dataset.from_tensor_slices((x_train,y_train))
      dataset = dataset.shuffle(100).batch(64,drop_remainder=True)
      return dataset
      
    train_dataset = get_dataset(noisy_train[:40000],clean_train[:40000])
    test_dataset = get_dataset(noisy_train[40000:],clean_train[40000:])

    Creating the Model

    The code for the model architecture is as follows:

    inp = Input(shape=(batching_size,1))
    c1 = Conv1D(2,32,2,'same',activation='relu')(inp)
    c2 = Conv1D(4,32,2,'same',activation='relu')(c1)
    c3 = Conv1D(8,32,2,'same',activation='relu')(c2)
    c4 = Conv1D(16,32,2,'same',activation='relu')(c3)
    c5 = Conv1D(32,32,2,'same',activation='relu')(c4)
    
    
    dc1 = Conv1DTranspose(32,32,1,padding='same')(c5)
    conc = Concatenate()([c5,dc1])
    dc2 = Conv1DTranspose(16,32,2,padding='same')(conc)
    conc = Concatenate()([c4,dc2])
    dc3 = Conv1DTranspose(8,32,2,padding='same')(conc)
    conc = Concatenate()([c3,dc3])
    dc4 = Conv1DTranspose(4,32,2,padding='same')(conc)
    conc = Concatenate()([c2,dc4])
    dc5 = Conv1DTranspose(2,32,2,padding='same')(conc)
    conc = Concatenate()([c1,dc5])
    dc6 = Conv1DTranspose(1,32,2,padding='same')(conc)
    conc = Concatenate()([inp,dc6])
    dc7 = Conv1DTranspose(1,32,1,padding='same',activation='linear')(conc)
    model = tf.keras.models.Model(inp,dc7)
    model.summary()
    [Image: model.summary() output]

    The model is a purely convolutional one. The goal is to find a bunch of filters that help minimize the background noise. To help with this, we add residual connections to provide context from the original audio sample. The idea behind this model is derived from the SEGAN network [Pascual et al.]. It builds on the fact that a purely convolutional network can handle inputs of multiple shapes with ease, leading to more flexibility. The convolutional nature forces the model to focus on temporally close correlations throughout the model. The model can be split into two parts: the convolutions and the de-convolutions (or upsampling layers). The strided convolutional layers behave as an auto-encoder, where after N layers a reduced representation of the input is obtained. The de-convolutions perform exactly the opposite strided procedure to obtain a cleaned representation of the noisy input. The skip connections provide the required context to the de-convolution layers at every step, which leads to better overall results.

    The model is compiled with a Mean Absolute Error (MAE) loss.

    The choice of optimizer was difficult, as SGD, RMSprop and Adam stood pretty close. I finally went ahead with Adam just because of its slightly more robust nature. Some hyperparameter tweaking gave me a pretty good learning rate of 0.002.
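
    The original post shows these steps as screenshots; below is a hedged sketch of what the compile and fit calls could look like, using the MAE loss and the Adam optimizer at the 0.002 learning rate mentioned above (the epoch count is an assumption, not taken from the original text):

    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.002),
                  loss='mean_absolute_error')

    history = model.fit(train_dataset,
                        validation_data=test_dataset,
                        epochs=20)  # illustrative epoch count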

    The final results:

    1. Training loss: 0.0117

    2. Testing loss: 0.0117

    [Image: Visible reduction of noise]

    The results are quite pleasing, but we are not done yet. We still have to handle the inference procedure for variable-sized inputs. We will do that next.

    Handling the Variable Input Shape

    Our model is trained with a very specific input shape, which depends on our batching_size. To allow inputs of other shapes, a simple strategy works really well.

    We run our model through all the splits up to the (n-1)th split. Consider the example where batching_size is 12000 and your audio array has shape (37500,). In this case we split the audio waveform into floor(37500/12000) = 3 splits. The remaining part of the array will have shape (1500,). To handle this, we sample one more frame, but this time from the rear end of the array, something like this:

    [Image: Overlapped Frames]
    1. Now, we run all the 4 splits through the model to get individual predictions.

    2. From the output predictions, we extract the first three frames as they are and clip the last frame to keep only the remaining part.


    At this point, some code will help with clarity:

    def get_audio(path):
      audio,_ = tf.audio.decode_wav(tf.io.read_file(path),1)
      return audio
    
    
    def inference_preprocess(path):
      audio = get_audio(path)
      audio_len = audio.shape[0]
      batches = []
      for i in range(0,audio_len-batching_size,batching_size):
        batches.append(audio[i:i+batching_size])
    
    
      batches.append(audio[-batching_size:])
      diff = audio_len - (i + batching_size)  # Calculation of length of remaining waveform
      return tf.stack(batches), diff
    
    
    def predict(path):
      test_data,diff = inference_preprocess(path)
      predictions = model.predict(test_data)
      final_op = tf.reshape(predictions[:-1],((predictions.shape[0]-1)*predictions.shape[1],1))  # Reshape the array to get complete frames
      final_op = tf.concat((final_op,predictions[-1][-diff:]),axis=0)  # Concat last, incomplete frame to the rest
      return final_op

    Okay, but how fast is it?

    %%timeit
    tf.squeeze(predict(noisy_sounds[3]))

    OUTPUT: 10 loops, best of 3: 31.3 ms per loop

    If we specify the input shape of the model as (None,1), we can pass a variable-length tensor to the model, which gives even faster results. For now, we want to quantize the model for cross-device compatibility.
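
    The post does not include code for this, but one hedged sketch of the idea is to clone the trained network onto a (None, 1) input and copy the weights. Because the encoder halves the time axis five times, the skip connections only line up when the input length is a multiple of 32, so the waveform is padded first (the clone_model call and the padding trick are assumptions, not part of the original notebook):

    # Sketch only: rebuild the same architecture with a flexible time dimension.
    flex_input = Input(shape=(None, 1))
    flex_model = tf.keras.models.clone_model(model, input_tensors=flex_input)
    flex_model.set_weights(model.get_weights())

    def predict_flexible(path):
      audio = get_audio(path)               # shape (n, 1)
      pad = (-audio.shape[0]) % 32          # pad so the skip connections align
      padded = tf.pad(audio, [[0, pad], [0, 0]])
      denoised = flex_model(tf.expand_dims(padded, 0))[0]
      return denoised[:audio.shape[0]]      # trim back to the original length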

    Optimization and Creation of the TFLite Model

    Using the TFLiteConverter is pretty straightforward. You pass the Keras model along with an optimization strategy (the TF documentation recommends using only DEFAULT) and write the converted model to a binary file for future use.

    lite_model = tf.lite.TFLiteConverter.from_keras_model(model)
    lite_model.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model_quant = lite_model.convert()
    
    
    with open('TFLiteModel.tflite','wb') as f:
      f.write(tflite_model_quant)

    TFLite Model Inference

    The preprocessing is similar to that of the Keras model, but since I could not find anything on batching for TFLite models (please let me know if such support exists), I had to use a Python for loop to iterate over all splits. The code below shows instantiation of the Interpreter and allocation of tensors, followed by invoking it to get our results.

    # Initializing the Interpreter and allocating tensors
    interpreter = tf.lite.Interpreter(model_path='/content/TFLiteModel.tflite')
    interpreter.allocate_tensors()
    
    
    def predict_tflite(path):
      test_audio,diff = inference_preprocess(path)
      input_index = interpreter.get_input_details()[0]["index"]
      output_index = interpreter.get_output_details()[0]["index"]
    
    
      preds = []
      for i in test_audio:
        interpreter.set_tensor(input_index, tf.expand_dims(i,0))  # We will have to pass individual splits since tflite doesn't support batching at this moment
        interpreter.invoke()
        predictions = interpreter.get_tensor(output_index)
        preds.append(predictions)
    
    
      predictions = tf.squeeze(tf.stack(preds,axis=1))
      final_op = tf.reshape(predictions[:-1],((predictions.shape[0]-1)*predictions.shape[1],1))
      final_op = tf.concat((tf.squeeze(final_op),predictions[-1][-diff:]),axis=0)
      return final_op

    Now the question arises: how much better is this than the Keras model? The answer isn't simple. Since I was unable to process all batches together, the overall inference time was affected, but the TFLite model on its own is faster than the Keras model.

    %%timeit
    predict_tflite(noisy_sounds[3])

    OUTPUT: 10 loops, best of 3: 41.7 ms per loop

    Out of all the advantages of the TFLite format, cross-platform deployment is the biggest one. The model can now be ported much more easily than the Keras model, not to mention its super small size: just 346 kB.
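
    A quick, hedged way to check that figure for the file written in the conversion step above:

    import os

    # Size of the converted model on disk
    print(f"{os.path.getsize('TFLiteModel.tflite') / 1024:.0f} kB")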

    [Image: Plot for TFLite Model]

    That's it for now!

    The model can be improved further by adding filters, creating a deeper model, and optimizing the pipeline, but that is for next time. As we come to the end of the article, I would like to mention some references and links.

    1. Colab Notebook for the code and audio samples: Here

    2. Dataset: Here

    3. SEGAN paper: Here

    Any comments or suggestions would be much appreciated. Thanks for reading!

    Translated from: https://medium.com/analytics-vidhya/noise-suppression-using-deep-learning-6ead8c8a1839

