Asset managers are increasingly challenged by their investors to manage their portfolios for social
                        impact. However, it is non-trivial for investors and investment managers to maintain regular
                        oversight over the social impact composition of their portfolios. The U.S. Securities and
                        Exchange Commission’s (SEC’s) EDGAR maintains a digital record of the portfolio filings of
                        publicly traded asset managers.
                        
                        
One way to gauge how much investors, and even more so the general public, are interested in social good
projects is to analyse data from social media platforms such as Twitter. With this information,
one can also relate public opinion about a given company to how much that company invests in
social good.
                        
                        
In this research project, the term social good was defined according to the Sustainable Development Goals (SDGs).
The SDGs are a collection of 17 global goals set by the United Nations General Assembly in 2015.
They are part of the resolution "Transforming our World: the 2030 Agenda for Sustainable Development," a title that has been shortened to "2030 Agenda."
                        The goals are broad and interdependent, yet each has a separate list of targets to achieve.
                        Achieving all 169 targets would signal accomplishing all 17 goals. The SDGs cover social and
                        economic development issues including poverty, hunger, health, education, global warming,
                        gender equality, water, sanitation, energy, urbanization, environment and social justice.
Within this project, an investment in social good was defined as an investment in
any of the 17 goals stated by the United Nations.

Among these SDGs, we decided to focus on the following three goals:
                        
                        
                    
Decent work and economic growth: By 2030, the target is to establish policies for sustainable tourism that will create jobs. Strengthening domestic financial institutions and increasing Aid for Trade support for developing countries is considered essential. Trade-Related Technical Assistance to Least Developed Countries is mentioned as a method for achieving sustainable economic development.
Industry, Innovation, and Infrastructure: Build resilient infrastructure, promote inclusive and sustainable industrialization, and foster innovation.
Responsible consumption and production: The targets of Goal 12 include using eco-friendly production methods and reducing the amount of waste. By 2030, national recycling rates should increase, as measured in tons of material recycled. Further, companies should adopt sustainable practices and publish sustainability reports.
                            
                            
In addition to these qualitative criteria, it was possible to use the data from JUST Capital,
which publishes a ranking of America's most just companies. JUST Capital measures and ranks companies on
the issues Americans care about most, so that people can act on that knowledge.
The aim is to influence the purchase decisions, investment dollars, and career choices of people
living and working in the US, and thereby give them the power to make the world a more just place.
                            
JUST Capital was co-founded in 2013 by a group of people from the world of business, finance,
and civil society. By setting the organization up as a not-for-profit registered charity, the founders
ensured that JUST Capital would be exclusively geared towards achieving its mission. JUST Capital
publishes its ranking of US companies annually.
                            
This ranking was then used by our group as a factor in the Social Good score of each company.
Details about the calculation of this score are given below in The Ethics of Investing
part.
                        
For this project, two datasets were used. The first dataset consisted of the
U.S. Securities and Exchange Commission (SEC) archives. Because the pages in the SEC's archives
where the investment portfolio filings are stored contain consistently structured text and HTML table data,
it was possible to generate a .csv-like file listing the filings by company and date.
We were able to automate the extraction and thereby obtain the type and scope of each asset
manager's investment holdings (which companies they invested in, and the size and value of each investment).
This data can be enriched by processing additional social data from Twitter related to portfolio companies
into signals of their social impact, and by mapping these social impact signals to investment portfolios.
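As an illustration, the extraction of an HTML holdings table into .csv-like rows can be sketched as follows. This is a minimal sketch using only the Python standard library, not the project's actual pipeline: the function name `holdings_table_to_csv`, the column names, the manager name, and the sample HTML are all hypothetical, and real filing pages would need more cleaning.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def holdings_table_to_csv(html, manager, filing_date):
    """Turn one holdings table into CSV text, tagging each row with its filing."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return ""
    header, *body = parser.rows
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["manager", "filing_date", *header])
    for row in body:
        writer.writerow([manager, filing_date, *row])
    return out.getvalue()


# Hypothetical sample input; real filings have more columns and messier markup.
SAMPLE_HTML = """
<table>
  <tr><th>issuer</th><th>value_usd_thousands</th><th>shares</th></tr>
  <tr><td>EXAMPLE CORP</td><td>1200</td><td>50000</td></tr>
  <tr><td>SAMPLE INC</td><td>300</td><td>7000</td></tr>
</table>
"""

csv_text = holdings_table_to_csv(SAMPLE_HTML, "Hypothetical Manager LLC", "2017-12-31")
print(csv_text)
```

Tagging every row with the manager and filing date keeps the output flat, so that tables from many filings can be concatenated into one listing.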
                            
                            Based on this data from investment portfolio filings, we could identify and analyze investments
                            made in companies that pursue goals related to social good from all different kinds of industries.
                            
Nevertheless, one needs to be careful with the interpretation of the data. This dataset contains
biases that one needs to be aware of before drawing general conclusions.
First, the data is collected quarterly and covers the major players
of the US stock market. It would have been interesting to have the same kind of information
about Switzerland and the rest of the world. However, the SEC dataset was the most
complete and consistent source we were able to find. Therefore, any conclusion we
draw is likely to hold for the USA but not necessarily for the rest of the world.
Secondly, the dataset contains only institutional investment managers with holdings over $100 million.
Furthermore, the 13F form must be filed within 45 days of the end of a calendar quarter,
which is a significant information latency. It also reports only long positions
(not short positions), and since different investment managers pursue different strategies, this may bias results.
However, the vast majority of investment managers rely on long positions for a significant
portion of fund performance.
Another drawback is that the 13F form does not reveal international holdings (except for
American depositary receipts) and excludes total portfolio value and the percentage allocation of
each stock listed.
Section 13(f) securities generally include equity securities that trade on an exchange (including Nasdaq),
certain equity options and warrants, shares of closed-end investment companies, and certain convertible debt securities.
Shares of open-end investment companies (i.e. mutual funds) are not Section 13(f) securities.
The official list of qualifying securities is published by the SEC.

Finally, it is necessary to point out that a 13F filing does not represent the whole portfolio of an investor; it is a snapshot of the past.
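To make the reporting latency concrete, the 45-day filing window described above can be turned into a small date calculation. This is only a sketch: the helper names `quarter_end` and `filing_deadline` are hypothetical, and the calculation ignores any adjustment for weekends or holidays.

```python
from datetime import date, timedelta


def quarter_end(day):
    """Return the last day of the calendar quarter that contains `day`."""
    end_month = 3 * ((day.month - 1) // 3 + 1)  # 3, 6, 9 or 12
    if end_month == 12:
        return date(day.year, 12, 31)
    # First day of the following month, minus one day.
    return date(day.year, end_month + 1, 1) - timedelta(days=1)


def filing_deadline(day, lag_days=45):
    """Latest filing date for the quarter containing `day` (calendar days only)."""
    return quarter_end(day) + timedelta(days=lag_days)


snapshot = quarter_end(date(2017, 11, 20))      # 2017-12-31
deadline = filing_deadline(date(2017, 11, 20))  # 2018-02-14
print(snapshot, deadline)
```

In this example, holdings as of the end of 2017 may only become public in mid-February 2018, which is why a filing should be read as a past snapshot.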
                        
The Twitter dataset consisted of tweets from 2017. The research questions that needed enrichment with social media data all focused on how opinion about selected companies evolved over time. Therefore, we did not need data from every day. We decided that it would be wiser, and sufficient, to take 10-day samples throughout 2017, which made the computation feasible in Pandas. Since we were looking for information about investors and companies in the USA, we decided to look only at tweets in English. This reduces the number of tweets per day by roughly 70%, since the majority of tweets are written in the mother tongue of the user (Chinese, Japanese, German, ...).
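The sampling and language filter described above can be sketched in plain Python as follows. The number of windows, their even spacing through the year, and the record fields `day`, `lang`, and `text` are assumptions made for illustration. The text only states that 10-day samples were taken in 2017 and that English tweets were kept. The same filter could be written as a boolean mask in Pandas.

```python
from datetime import date, timedelta


def sample_windows(year, window_days=10, n_windows=4):
    """Return (start, end) date pairs for evenly spaced windows within a year.

    The number of windows and their spacing are assumptions of this sketch.
    """
    year_start = date(year, 1, 1)
    days_in_year = (date(year + 1, 1, 1) - year_start).days
    step = days_in_year // n_windows
    windows = []
    for i in range(n_windows):
        start = year_start + timedelta(days=i * step)
        end = start + timedelta(days=window_days - 1)
        windows.append((start, end))
    return windows


def keep_tweet(tweet, windows, lang="en"):
    """Keep a tweet only if it is in the wanted language and inside a window."""
    if tweet["lang"] != lang:
        return False
    return any(start <= tweet["day"] <= end for start, end in windows)


# Made-up records standing in for the real tweet data.
tweets = [
    {"day": date(2017, 1, 3), "lang": "en", "text": "example tweet A"},
    {"day": date(2017, 1, 3), "lang": "ja", "text": "example tweet B"},
    {"day": date(2017, 2, 15), "lang": "en", "text": "example tweet C"},
    {"day": date(2017, 7, 5), "lang": "en", "text": "example tweet D"},
    {"day": date(2017, 10, 10), "lang": "en", "text": "example tweet E"},
]

windows = sample_windows(2017)
kept = [t for t in tweets if keep_tweet(t, windows)]
print(len(kept), "of", len(tweets), "tweets kept")
```

Filtering by language before any text processing is the cheaper order, since it discards most records with a single field comparison.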
                        Ultimately, our data story documents our journey into understanding the social good impact performance of large asset manager investors, 
                      and how the prevailing public opinion supporting socially responsible investing may have influenced asset managers to be more socially impactful. 
In order to explore these topics, we composed five questions, each of which is documented below as a data substory.
                      
                      
We first seek to understand the nature of our asset management investor social impact dataset.
This first research question of the project is addressed in the
Who owns Who? substory.
The aim of this story is to give the reader better insight into the data and to present the
way the data was cleaned. From the data it was possible to observe, for example,
that the lowest scoring investor had a very specialized investment strategy (event-driven investing).
At the other end of the spectrum, the highest scoring investor, Aetna Inc, is a healthcare company,
and is thus intrinsically driven to invest based on values aligned with social impact outcomes.
                        
                        
After this first part dedicated to data exploration, we attempt to understand the nature
of high and low social impact investments in an asset manager's portfolio. For example, why do these investors choose
to invest in certain stocks that have a positive impact on society, or in stocks that are detrimental to it? And what is the logic behind these motivations?
This second substory is prefaced by a definition of
the terms ethics and assets, since these terms accompany us throughout our journey through the
data. In The Asset Manager's Portfolio substory,
we show the popularity of the high and low impact stocks, and we show that an investor
could easily end up selecting a portfolio with more high social impact investments simply by
looking for stocks with high historical financial growth, such as the technology companies Amazon and Apple.
                      
  
                      
Although The Asset Manager's Portfolio data substory allowed us to observe a pattern of asset management
investors making more high social impact investments than low social impact investments, this conclusion was specific to 2017. In order to determine whether investments
in social good are a recent trend or whether there is a tradition of investing in ethically favorable assets,
we chose to compare against the data of the previous year. By doing so in the Towards Social Good substory,
it was possible to see that investors do not seem to be, as a general trend, shifting towards higher social impact investments.
                      
                      
                      
                      
But what does the general public think about these investors? Does the general public care about
the ethics of the companies from which they buy products? To answer these questions, sentiment analysis
was applied to the tweets from 2017. The results are shown in Sentiment Analysis.
It was very interesting to see what the general public thinks about the assets. However, no clear correlation
between social opinion and social score could be established.
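One simple way to check for such a relationship is a correlation coefficient computed over per-company pairs of sentiment and social score. The sketch below uses the Pearson coefficient, which is an assumption: the text does not state which measure was used, and the numbers are made up purely to show the mechanics.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up values: one mean tweet sentiment and one social score per company.
sentiment = [0.10, -0.05, 0.30, 0.02, 0.15]
social_score = [55.0, 62.0, 48.0, 70.0, 51.0]

r = pearson(sentiment, social_score)
print(f"Pearson r = {r:.2f}")
```

A value of r near zero would be consistent with the absence of a clear linear relationship, although it would not rule out a non-linear one.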
Finally, the geographical location of the most ethical companies was studied in the Investment Geography of the USA.
No clear geographical trend emerged relating the social score to the
location of the companies.
As a teaser for further work and for what could be added to our project, we created the
Other Countries substory.
The aim would be to carry out the same kind of research for other countries. Therefore, we invite all curious
readers to extend our project by forking it on GitHub.
                    
 
             
The basics and fundamentals of data analysis were studied in class during the course. Four graded homeworks guided the students through the course material.
 
The project repo contained a README describing the project idea (title, abstract, questions, dataset, milestones, according to a provided skeleton).
 
The project repo contains a notebook with data collection and descriptive analysis, properly commented, and the notebook ends with a more structured and informed plan for what comes next (all the way to a plan for the presentation). These sections of the notebook should be filled in by milestone 3.
 
                            Data story in a platform of your choice (e.g., a blog post, or directly in GitHub), plus the final notebook
 
Computational Science & Engineering (CSE)
 
                        Computer Science (CS)
 
Computational Science & Engineering (CSE)
We would like to take this opportunity to thank you, the reader, for taking the time to read what we
had to tell. Furthermore, we would like to thank the whole ADA teaching staff for the great job
they did throughout the semester. It is certainly one of the most time-consuming classes at EPFL,
but we learned a lot for the future.
                        
                        
Merry Christmas!