Readings for 10/10/18
Readings are now up at the regular location. They are a bit shorter this week but more dense 😸 Please post by 10/9 at noon.
Given our discussion last week, I thought I'd share two stories (one recent) about bad research practices and ethics:
https://www.npr.org/sections/thesalt/2018/09/26/651849441/cornell-food-researchers-downfall-raises-larger-questions-for-science
https://fivethirtyeight.com/features/science-isnt-broken/
Finally, a few of you have already approached me about your project. For those of you who are still thinking about it or looking for data, please see the links here:
http://www.memphis.edu/iis/resources/index.php
Domingos:
Domingos argues for the need to create an interface layer for AI research. This layer creates a buffer between foundational and applied research, allowing each to branch off into its own work without having to worry about the other’s. This makes both branches of research far more efficient. He proposes his own system, Alchemy, based on his hybrid Markov logic. What remained opaque for me was why a hybrid AI is “clear[ly]” (p. 2) superior. Why is a hybrid system better than, e.g., neural networks? What’s more interesting to me, actually, is: can hybrid systems better model real minds than purely logical or statistical ones? (I think LIDA is a hybrid model?) If so, is this an artifact of our programming and computing architecture, or is this hybridity pointing to some feature of real minds? Purely symbolic models of mind are not popular anymore, but I’m curious if there’s something salvageable in those kinds of methods. Pylyshyn and Fodor have a 1988 paper where they argue that connectionist networks can realize symbolic systems. While their approach is now disfavored, Domingos’ Markov logic is completely different. That being said, he is not trying to model real minds.
Ghahramani (2015):
I’d like to know more about how lossless compression works. I’m familiar with it as a computer user (e.g., gif, png, flac, zip), but I have no idea what it’s doing.
Domingos
Domingos attributes the limited rate of progress in AI to the absence of an interface layer. Domingos claims that “in most cases progress has been enabled above all by the creation of an interface layer that separates innovation above and below it, while allowing each to benefit from each other” (p. 1). Domingos has developed a system called Alchemy to serve as the interface layer of AI. I didn’t see the date of this article, but I am wondering how Alchemy is doing in being this needed interface layer. How has AI progressed since this was introduced?
Ghahramani (2015)
This article was a lot to comprehend for me. Recently, I have heard a lot about Bayesian concepts, but I do not have a full understanding of them. I would like to see if someone can explain this concept in class or on the discussion board for me. I think the thing that interested me the most in this article is the focus on whether uncertainty has a role regardless of whether the data set is large or small. I think this is important given the concept of big data and how it changes the research field.
Ghahramani (2015)
There were some very interesting ideas raised in this article, though I feel there is still some gap in the explanation. I thought the Automatic Statistician was a very interesting application, though other areas were less clear in their application. The Bayesian optimization section (p. 5) presented a potential solution to determining where to sample next, based on uncertainty around certain points (i.e., which point of x provides the best gain), which made a lot of sense. That being said, I am still missing how this is applied. It seems to presuppose that one has an infinite number of observations along x so that it can be sampled. How then does this uncertainty develop? Does the model use some “seed” to semi-randomly begin forming the function and then maximize “filling in the gaps” of the function? It seems like this is a very specific solution to the problem where you have an infinite amount of data and absolutely don’t know where to start, while knowing that the answer really is represented in this data, which seems to be a pretty small sample of the problems people hope to solve.
For example, in building a model of genetic contributions to cancer, I’m not sure if this offers a meaningful step forward, as the missing steps between calculating the association between the presence of a certain gene or allele and the expression of cancer (which is usually only the point at which it is diagnosed through spurious testing, or its development to a point where symptoms present themselves) represent the true unobserved intermediate states. The maximizing x in this case likely represents the entirety of the unobserved development or degradation of this gene’s expression, which occurs during unobserved time, at a scale that, thus far, seems beyond measurement (e.g., watching the activity of a set of genes over an extended period of years). In this case, it seems that the classifier is positing the exact idea that one could state with a handful of observations and some intuition: we need to be able to see inside the person across time to figure out how this works.
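A note on the sampling question raised above: Bayesian optimization does not need observations everywhere along x. It starts from a prior over functions plus a handful of evaluations, and the uncertainty at unsampled points comes from the model's posterior. The following is a minimal sketch, not the article's implementation. The squared-exponential kernel, the upper-confidence-bound rule for picking the next point, the two starting points, and `expensive_function` are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    """Squared-exponential covariance between two 1-D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_oo = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    k_qo = rbf_kernel(x_query, x_obs)
    mean = k_qo @ np.linalg.solve(k_oo, y_obs)
    # Prior variance is 1 everywhere; observations can only reduce it.
    reduction = np.sum(k_qo * np.linalg.solve(k_oo, k_qo.T).T, axis=1)
    std = np.sqrt(np.clip(1.0 - reduction, 0.0, None))
    return mean, std

def expensive_function(x):
    """Hypothetical stand-in for a costly experiment, unknown to the optimizer."""
    return np.sin(3.0 * x) + 0.5 * x

def bayes_opt(n_steps=8, kappa=2.0):
    """Repeatedly evaluate where 'predicted value + kappa * uncertainty' is highest."""
    x_grid = np.linspace(0.0, 2.0, 201)   # candidate locations, not observations
    x_obs = np.array([0.2, 1.8])          # two arbitrary starting points (the "seed")
    y_obs = expensive_function(x_obs)
    for _ in range(n_steps):
        mean, std = gp_posterior(x_obs, y_obs, x_grid)
        x_next = x_grid[np.argmax(mean + kappa * std)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, expensive_function(x_next))
    best = np.argmax(y_obs)
    return x_obs[best], y_obs[best], len(x_obs)

if __name__ == "__main__":
    best_x, best_y, n_evals = bayes_opt()
    print(f"best x = {best_x:.3f}, best y = {best_y:.3f}, evaluations = {n_evals}")
```

In this sketch the function is only ever evaluated ten times. Everywhere else the model relies on the prior, with uncertainty shrinking near points that have already been sampled.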
Domingos (nd)
Although geared toward computer science, this article provided some interesting considerations for human intelligence and progress within fields. Domingos mentions how the interface leads to developments above and below the layer, which led me to some potential examples that made the idea easier to conceive of. fMRI seems like such an example. With its development, some researchers were able to dive deeper into the architecture of the brain, better seeing the spatial relationships between different regions in certain processes, like noting the inverse relationship between activation of the amygdala and certain regions of the PFC mediated by activation in the dACC, so subsequent research could more closely investigate this, say by lesion studies in rodents (forgive me, neuroscientists, if I have inaccurately described this). Simultaneously, above the layer, individuals were able to start applying this knowledge to try to show specific relationships, like associating emotion expression with amygdala activity, without needing to understand underlying processes, like the BOLD response, at any real depth.
I have avoided discussion of the actual topic here, as I really do not understand what Alchemy is, though I do like that, based on their description, this may not be too necessary if one were to focus above the interface layer on application. Its goal of creating a sort of “feedback network” where individuals working in different knowledge domains can use the system to generate predictions and then modify it to correct it when it fails until knowledge is gained and eventually used (I assume) in interactions with other related ideas does seem intuitively appealing. I’m curious to hear what the speaker has to say about it, when this idea can be presented with a more lay audience in mind.
Deepak just shared these slides; they are not assigned reading but take a look if you can:
https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxhYWFpMTh0dXRvcmlhbHxneDphZjE2NDExYThhMjg2ZDc
Domingos:
Because of my limited knowledge of AI, prior to reading this article, I would not have even acknowledged that an interface layer of AI was missing. I know that models and theories and machines are being improved and enhanced on a daily basis, but the influence of this missing aspect of AI was news to me. I have to make the assumption that, with all the examples he gave of interface layers, each has limitations/constraints/shortcomings for the layers above and below, so at what point do you compromise with an interface layer of AI? I’m sure there is a lot of complexity that goes into it, as is the topic of the article, but I just wonder to what extent? I read up a little bit on the book Domingos has written, The Master Algorithm, and he seems to be very convinced that machines will become even more deeply embedded in humans’ lives than they already are, making our lives look entirely different, almost taking over them. If that were the case, I would think that we would need to be pretty confident in the basis of AI, and it seems like we have a bit of progress to make in that regard.
As far as Alchemy goes, it seems as though the research on it and its application are just at the surface. That is, Domingos gives the impression that, as much of Alchemy as has been created and put into practice, there are still modifications and subsequent work that need to be done. So in response to Courtney’s question, I would be curious to know what those corrections and improvements are and what they are doing to move forward with bettering this mechanism/network.
Ghahramani (2015):
I think a lot of the material in this article kind of went over my head. I got a base understanding of the point Ghahramani was trying to make, but it wasn’t fine-tuned for me. This whole idea of measuring and representing uncertainty seems to have usefulness if a program/procedure can be executed to do so accurately, but in weighing what we are not sure of against what we know, when do we decide which is more important? Does it depend on the research question we are asking? Or the specific results we want?
The Ghahramani article outlined the premise of Bayesian machine learning. I thought the box walking through the essential formulas was particularly useful. I think it’s common to gloss over formulas, but it’s important to remember that formulas can sometimes serve to illustrate concepts as much as they serve as something to plug actual numbers into. The most amazing thing to me, however, was the section detailing the Automatic Statistician. I’m particularly interested in the programmed grammar that describes the models in English. However, I always hear about how more complex models can become increasingly difficult to interpret. While it seems relatively straightforward to get the system to output English words when certain conditions are fulfilled, I wonder what the limits of constructing such a system are given how complex models can get.
I spent a lot of time going down Wikipedia rabbit holes alongside the Domingos article, and it was both fun and pretty overwhelming at times. After making what basically looks like a pocket dictionary of computer science and formal logic terminology, I think I understand that what makes Alchemy special (aside from the open-source approach) is that it combines first-order logic (which, to my understanding, adds quantifiers like “every” and “some” to the front of propositional if-then statements) with statistical learning (i.e., reinforcement through probabilistic decision making). My rudimentary understanding of Alchemy, then, is that it is a general system that directs other, more task-specific systems differentially, on the premise that in cases where some (but not others) are used, a certain likelihood of success is probable. Does that sound close to anybody more familiar with formal logic, and especially with computer science? Extending this, would the next step toward a better general AI resemble an order-of-magnitude sort of thing, where you have a central system telling various Alchemies how to delegate their task-driven systems, or is it the case that something like Alchemy can become so fluid that this is what should drive the dynamic general AI itself?
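One way to picture the combination of logic and probability described above is a toy Markov logic model. In Markov logic, each first-order formula carries a weight, and a possible world's probability is proportional to exp(weight × number of true groundings of the formula), summed over formulas. The sketch below is not Alchemy's code and makes no claim about its API. The constants, the single formula Smokes(x) => Cancer(x), the weight of 1.5, and the brute-force enumeration are assumptions chosen for illustration.

```python
import itertools
import math

PEOPLE = ["Anna", "Bob"]  # hypothetical constants
ATOMS = [(pred, person) for pred in ("Smokes", "Cancer") for person in PEOPLE]

def n_true_groundings(world):
    """Count the groundings of Smokes(x) => Cancer(x) that hold in this world."""
    return sum((not world[("Smokes", p)]) or world[("Cancer", p)] for p in PEOPLE)

def all_worlds():
    """Yield every truth assignment to the ground atoms (2**4 = 16 worlds)."""
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))

def conditional(query, evidence, weight):
    """P(query atom is true | evidence atom is true), by enumerating worlds."""
    numerator = denominator = 0.0
    for world in all_worlds():
        if not world[evidence]:
            continue
        # Unnormalized probability of a world: exp(weight * true groundings).
        score = math.exp(weight * n_true_groundings(world))
        denominator += score
        if world[query]:
            numerator += score
    return numerator / denominator

if __name__ == "__main__":
    p = conditional(("Cancer", "Anna"), ("Smokes", "Anna"), weight=1.5)
    print(f"P(Cancer(Anna) | Smokes(Anna)) = {p:.4f}")
```

With a weight of zero the formula has no effect and the conditional probability is 0.5. As the weight grows, the rule behaves more and more like a hard logical constraint. That is the sense in which a weighted formula is a "soft" version of a first-order rule.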
Domingos proposes that, in spite of the tremendous progress in AI in the last 50 years, more is needed to reach and surpass human intelligence. Domingos et al. (2006) introduced Markov logic, a language that combines first-order logic and Markov networks. Domingos argues that the interface layer needs to integrate first-order logic and graphical models.
Ghahramani (2015) provided much-needed explanations of probability, calculus, and Bayesian applications. The idea that data and big data are useless until knowledge and inferences are extracted is refreshing. He concluded that probabilistic modelling will continue to be a central part of machine learning and AI systems.
Ghahramani:
This article was interesting, though I do feel that I didn't fully understand the explanation of every point. I know a few applications were mentioned throughout, but I would be curious about how else this can be applied. I also thought the point about "big data" really just being a bunch of small data sets was interesting, as I haven't really heard it phrased that way often. Is this a common way to look at big data?
Domingos:
Alchemy seems very interesting, and I agree that it would be beneficial to promote interdisciplinary learning as much as possible, because we can definitely increase learning and make more progress by working in tandem with other fields studying similar concepts. Is there a way to control for or remove input that is "wrong" or based on bad research? While I would hope that the researchers using the system would be knowledgeable in their domain and use proper techniques, I feel like less trustworthy researchers might be afraid that there may be underlying problems that they themselves might not be able to spot, since they may not be experts in that field.
This article by Ghahramani tried to explain briefly the various methods of machine learning. What are Gaussian processes and why are they better than deep-learning methods at recognizing faces? What’s the most math you need to know to understand machine learning? Can you explain what Monte Carlo methods are exactly? The article talked about them a lot. Will uncertainty continue to pose a big problem in machine learning, or will it become easier and easier to handle as computers become more powerful?
The article by Domingos talks about ways to accelerate the rate of AI progress. What is an O(n^2) problem? I liked the description of Markov logic. Is there a programming language that is better suited for it? It seems that Intelligent Tutoring Systems should be categorized in the interface layer box, but I’m not sure where.
Davis
Ghahramani 2015:
This was a lot of information to take in all at once, and I think I need to read it a few times to grasp more of the in-depth information it provides. I've heard of Bayesian techniques in some statistical/mathematical settings, but it's interesting seeing them applied to all this optimization and machine learning. I liked how this paper had a walk-through of the actual Bayesian machine learning formulas. It is nice to read about the concepts of how machine learning works, but it provides another level of understanding for me to actually see the rules and theories in the form of equations as something concrete that I can follow along with. The issue of uncertainty in modelling is an interesting concept (that has been mentioned in a few other comments as well) and I, like Ghahramani, am curious about the role of uncertainty in modelling with the big data that we have available.
Domingos ai
For a while I've thought that AI is something that is super cool, but I never understood how it worked. Right from the beginning of this paper I was a bit shocked at the claim that the crossover point of AI reaching human intelligence would not be met for hundreds of years. Since technology, in particular AI, has advanced so quickly over my lifetime, it is a little surprising to me that it will take that much longer - but since we don't entirely know how human intelligence works, I guess it makes sense that we can't create machines that have the same level of intelligence.
I had not previously thought about the lack of an interface layer in AI technology, but now that Domingos has brought that to my attention it very much makes sense to me. When I use my computer I am typically using the operating system UI as the interface; when I use the internet I use a browser that acts as the interface. Thinking about how I interact with a few examples of AI in my life, such as talking to Siri on my phone or using my Google Home, I realize that I only speak to these smart assistants and don't interact with them any other way. Do the voices of Siri/Google/Alexa/similar devices act as a sort of interface between the AI and the user? Obviously these smart assistants are only a small fraction of AI technology, and AI as a whole cannot have only one type of interface layer, since there are many types of AI that have different goals. What about the AI robot that was in the news a lot this past year, Sophia? (https://en.wikipedia.org/wiki/Sophia_(robot)) Does the robotic body given to (essentially) an AI chat bot count as an interface layer?
Ghahramani 2015
At first glance, I would have thought probabilistic models are more complex and difficult to understand, but I found it interesting that they actually end up being much simpler and more transparent at each stage of learning. This seems to be very advantageous in avoiding the black box problem (such as that in neural networks), allowing for more potential causal inferences. I had not heard of the Dutch book theorem before, but it makes sense not just in the context of AI but also for its implications in any action a rational agent might take. Could this mean that all rational agents use some form of Bayesian inference and/or that any failure of rationality is simply due to the incongruency of the rational agent’s beliefs with the real probabilities? I also had never heard of nonparametric models. It is interesting that Bayesian approaches get around the problem of overfitting given potentially infinite parameters by averaging over the parameters. For probabilistic programming it is still a bit unclear how the inference is actually automated. Are universal inference algorithms a feature of the language used? Or does probabilistic programming just make it possible? It seems the central question in using any probabilistic model is whether there is uncertainty involved in prediction or decision-making. I am, however, having trouble conjuring any scenario in which there isn’t some degree of uncertainty. Even in complete knowledge games, there is still a degree of uncertainty as to what move the opponent will make next. So, if that is the case, what kinds of problems don’t contain at least some degree of uncertainty?
Domingos
This concept of an interface layer certainly seems to be beneficial to innovation, allowing the separation of research into two different realms that both drive a field forward in different ways. I am curious how interface layers typically develop, though. Are they something specifically designed for this purpose? Or are they more a natural development of work in that field that so happens to function as an interface layer? If the latter is the case, it may be very difficult indeed to consciously construct something to act as an interface layer. It seems as though this paper has proposed Markov logic, more specifically Alchemy, as a constructed interface layer. According to Domingos, Alchemy integrates both logical and statistical solutions rather than using one or the other or simply creating a two-stage solution with one followed by the other. I do agree that the simpler it is for higher-level researchers to use a system and create models without having to worry as much about the lower-level processes, the faster progress in the area can be made and that potentially makes Alchemy very exciting. While I don’t know anything about Alchemy save for what is in this paper, I certainly want to learn more about it now.
Ghahramani (2015)
It is interesting that probabilistic modelling can be used to design machines that learn from experience. Even though I tried hard to understand probabilistic machine learning, I am still not very clear about it. What are the differences between probabilistic machine learning and other machine learning techniques or concepts? And what are the advantages of probabilistic machine learning compared to others, for example deep learning?
Domingos
The interface layer is a layer that connects applications above the layer and infrastructure below.
Domingos said it is hard to find a general interface layer for AI. They developed the Alchemy system to be the interface layer. Is this system widely used now? Are there any examples using the system?