The following are my personal thoughts on what makes a senior data scientist (SDS) or, in other words, what it takes for a junior to become one. I guess some of these ideas could be generalized to other fields as well, but my personal interest lies in the data science domain.
A senior data scientist should be independent.
Independence is the keyword here, but what does it mean? Independence encompasses various qualities and skills that one should acquire in order to advance in their career.
In general terms, the SDS should be totally independent in terms of research, from creating and labeling data to delivering a finished project based on the product design, i.e., end-to-end.
The SDS should understand modern product management and how to manage a project’s expectations while collaborating with stakeholders such as a PM. They should give their manager peace of mind and should be the person everyone on the team goes to for advice.
Let’s break it down into the key areas the SDS should be skilled in: personal, academic, technical, managerial, product, and business.
The SDS should have a can-do attitude. They should always strive to accept more responsibility, to use good judgment, and to go beyond what is expected of them.
To be proactive and always move forward, constantly improving and learning. To have complete, in-depth answers about critical issues and various use cases, while weighing all the possible pros and cons and fully understanding the outcomes of their decisions.
To be critical and give constructive feedback that can persuade their colleagues, while keeping in mind that criticism alone is not enough. Saying that something is “not good enough” is insufficient; they should always suggest a better alternative.
The SDS should be a thinker who cares about and understands the meaning of a true scientific process, balancing the product’s need to get a feature into production without compromising their scientific integrity.
To know that assumptions are the root of all evil and to revisit them when conditions change. To care about complete and correct validation, monitoring, explainability, and interpretability. To be discontent with an “it works” mentality. To accept scientific feedback and be highly critical of their own decisions and actions, which leads back to validating assumptions: hitting the brakes often and constantly looking at the data to be sure that what they are saying is correct.
To come up with out-of-the-box solutions: to be a MacGyver and a Lego master who builds algorithms from various scientific building blocks in order to find creative solutions for any challenge presented to them, regardless of the domain, the data, or the difficulty level. To help their colleagues by sharing knowledge, ideas, papers, and solutions. To talk fluently about the topics they know, to present their work coherently, to pass knowledge effortlessly to non-data-scientists and juniors, and to spot methodological problems in a short conversation.
They should read literature such as academic papers, blog posts, and technology and tool documentation in order to know more and expand their scope. To stay on top of current research and advances, even in other domains. The more you know, the better your craft will be. IMO, reading should be practiced outside of working hours. It is a personal journey to become a better scientist.
To know by heart the algorithms, methodologies, and metrics that Scikit-learn or similar packages offer, to know deep learning in depth, and to understand that the data, or the source of the data, is the all-important element that influences every aspect of our work.
The SDS should be a doer. They must be able to deliver a feature to production. They should be skilled in their chosen programming language and able to use and integrate packages from GitHub under various conditions. They should be able to deliver their algorithms as a class or a package based on standardized interfaces. They should understand CI/CD processes and strive for some form of development standardization.
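To make "a class based on standardized interfaces" a bit more concrete, here is a minimal sketch of what that could look like. Everything in it is an assumption for illustration only: the `Model` interface, the `ZScoreOutlierDetector` class, and the toy data are hypothetical names of my choosing. The only thing borrowed from a real library is the convention itself, namely the `fit`/`predict` pair and the trailing underscore on learned attributes that Scikit-learn uses. Your team's interface may look quite different.

```python
from statistics import mean, pstdev
from typing import List, Protocol, Sequence, runtime_checkable


@runtime_checkable
class Model(Protocol):
    """Hypothetical team-wide interface: every delivered algorithm exposes fit/predict."""

    def fit(self, values: Sequence[float]) -> "Model": ...

    def predict(self, values: Sequence[float]) -> List[bool]: ...


class ZScoreOutlierDetector:
    """Toy algorithm: flags values that lie far from the mean seen during fit()."""

    def __init__(self, threshold: float = 3.0) -> None:
        # Hyperparameters are set in the constructor and never learned from data.
        self.threshold = threshold

    def fit(self, values: Sequence[float]) -> "ZScoreOutlierDetector":
        if len(values) == 0:
            raise ValueError("fit() needs at least one value")
        # Learned state gets a trailing underscore, following the Scikit-learn convention.
        self.mean_ = mean(values)
        self.std_ = pstdev(values)
        return self  # returning self allows chaining fit() and predict()

    def predict(self, values: Sequence[float]) -> List[bool]:
        if not hasattr(self, "mean_"):
            raise RuntimeError("call fit() before predict()")
        if self.std_ == 0:
            # Constant training data: anything different counts as an outlier.
            return [v != self.mean_ for v in values]
        return [abs(v - self.mean_) / self.std_ > self.threshold for v in values]


# Made-up numbers, only to show the calling pattern.
detector = ZScoreOutlierDetector(threshold=2.0).fit([10, 11, 9, 10, 10])
flags = detector.predict([10, 50])
```

With the toy data above, `flags` comes out as `[False, True]`. The point is not the algorithm but the shape: code that consumes a `Model` does not need to know which algorithm sits behind it, which makes swapping, testing, and deploying it through a CI/CD pipeline much easier.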
The SDS should be a planner and a prioritizer. To convert complex projects into small, manageable tasks. To maintain constant visibility toward the organization and their managers. To prioritize tasks while considering available resources, i.e., important tasks get more attention while others get less. To become a “fire-and-forget” member. To be self-managed in order to reduce their manager’s overhead. To be able to manage expectations and risks. To write clear and thorough design documents and presentations, and to document their work. In other words, they have to be able to manage their projects and themselves with little to no guidance in order to deliver a completed project.
The SDS should know what modern product management is and be familiar with the concepts, methodologies, frameworks, and terminology of the product world. To understand and solve complex business problems and product questions while considering technical and algorithmic constraints, always thinking ahead about how to offer new capabilities for the features. This is by no means a way to replace the product team, but a way to offer a scientific and algorithmic point of view that the product team may not have.
The SDS should understand the business. Due to their early involvement in projects, the SDS needs to understand how the company makes money, what the business problems are, how new clients are found and signed, who the main competitors are, which business metrics the company cares about, what the client retention rate is, etc.
Lead / Principal
Lastly, you may be asking yourself what the qualities and skills of a lead or a principal DS are. In short, they should be able to do everything a senior DS can do, but at scale. To hire, to manage multiple projects and people, and to see a wider, bigger picture while still knowing the details. They need to be able to take on high-risk, high-impact business and product problems, do the needed research, and deliver working solutions with ease and speed, and without guidance.
Dr. Ori Cohen has a Ph.D. in Computer Science with a focus on machine-learning and brain-computer-interface (BCI). He has led a data-science team in a smart-city startup primarily doing natural-language-processing (NLP) and understanding (NLU) research, using machine and deep learning. Currently, he is a lead data-scientist at New Relic TLV in the field of AIOps. He regularly writes on Medium.com, about managing, processes, and all things data science.