Indeni and Python

Hawkeye_Parker · May 23, 2018, 4:10pm

I keep seeing Python mentioned in conjunction with Indeni. Does Indeni somehow ‘support’ Python scripting? Can we/will we be able to leverage Python as part of .ind scripting?

Alon_Ashkenazi · May 24, 2018, 2:38pm

Currently the indeni language doesn’t support Python. Can you share some use cases for how you would use Python with the indeni language?
Would you want to use Python in the parser section?

Hawkeye_Parker · May 24, 2018, 4:07pm

Definitely. Thanks for asking, Alon.

Yes, in the parser section. The huge benefit I see there would be code reuse. Python supports reuse through import, and so we could create re-usable classes/functions (libraries) for common tasks. As it stands, there’s a lot of duplicate code in the .ind source, and some of it is downright scary (e.g., checkpoint objects.C parsing). I’m sure you know, there are many, many good reasons to avoid duplicate code. I think it’s already possible, generally speaking, to “call” python scripts from AWK, but maybe the indeni infrastructure could make this easy (similar to the AWK ‘helper functions’).
In the case of AWK, AWK is great at line by line pattern matching, but, the language itself is a mess. Python is a full-featured, well-documented, well-established, modern programming language. It’s arguably the most intuitive language to learn, and it’s well-worth learning for anyone involved in tech. So, I can also imagine replacing the entire AWK parsing section with Python. The learning curve might be slightly steeper than for AWK in the beginning, but I think it would be much more attractive to potential IKEs, and in the long run, the code could be MUCH more robust. One big benefit would be the ability to leverage existing Python unit testing frameworks. command-runner is good for what it is (and I’m very glad we have it), but it’s a far cry from a “real” unit testing framework. Higher quality unit tests --> higher quality code --> higher quality product.
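A minimal sketch of the reuse idea, assuming a hypothetical shared module (here imagined as `parser_helpers.py`; indeni provides nothing like this today). The function names and the sample command output are made up for illustration:

```python
# Hypothetical shared helper. In practice it might live in a module such as
# parser_helpers.py and be pulled in with:
#     from parser_helpers import parse_key_value
def parse_key_value(line, sep=":"):
    """Split 'key: value' into a (key, value) tuple; return None if sep is absent."""
    if sep not in line:
        return None
    key, _, value = line.partition(sep)
    return key.strip(), value.strip()


# Two different "scripts" can now share the same parsing logic instead of
# each carrying its own copy.
def collect_pairs(text):
    """Collect every 'key: value' line of a command output into a dict."""
    pairs = {}
    for line in text.splitlines():
        parsed = parse_key_value(line)
        if parsed is not None:
            key, value = parsed
            pairs[key] = value
    return pairs


sample_output = "MemTotal: 2048 kB\nMemFree: 512 kB\n-- end of output --"
memory = collect_pairs(sample_output)
```

Here `memory` ends up holding only the two well-formed lines; the trailing line without a separator is skipped by the shared helper.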

Those are my basic thoughts.

Alon_Ashkenazi · May 24, 2018, 4:13pm

We chose AWK because it’s closer to the day-to-day tools that IT engineers use, so it should be easier to pick up. How do you think IND adoption and the learning curve would change if we switched to Python?

Hawkeye_Parker · May 24, 2018, 5:23pm

That makes good sense, and I can definitely see why you would have made that choice. In terms of the learning curve, let’s assume an IKE who has no Python experience but has worked with AWK on the command line (but not written complex AWK scripts). I would argue that, with regard to the basic language features, the learning curve is easier with Python: the documentation is better – the basic Python docs, and the many, many blogs, sites, SO posts, etc. AWK docs are weird and old, and there are many different versions of AWK. IKEs have reported getting scripts/features working on their local linux/mac environment, and then having the script fail cryptically in command-runner. Similarly, the Python ‘error feedback’ (i.e., script syntax errors and exceptions) is much more helpful and easier to decipher than what we get from AWK.

To put it another way: for a simple AWK script, things are pretty simple. But, if you start trying to do anything moderately complex, things get kind of bizarre. I’m coming from a software dev perspective (Java, C++, Python, etc), so maybe it’s just that I’m sensitive to the non-standard awk weirdnesses. But, I also think things are plain confusing and error prone in awk. So, once you get into more complex logic/data structures (e.g., awk’s associative array), I think Python is just so much better. You have variable scope. You have real arrays (Python lists) and real maps (dictionaries). You can create simple but coherent collections of data; and, you can actually test any code you write. These things are really fundamental to writing readable, predictable code. With AWK, we end up hacking together strange solutions that sort of work but are definitely not clean or “normal.”
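As a rough illustration of the data-structure point, here is a small sketch that groups values per key using a dictionary of lists, with all state local to the function. The input format (`<interface> <error count>`) and the function name are assumptions made up for this example:

```python
from collections import defaultdict


def group_by_interface(lines):
    """Map interface name -> list of integer error counts."""
    errors = defaultdict(list)  # local to this function, invisible to other code
    for line in lines:
        fields = line.split()
        # Skip anything that is not exactly "<name> <number>".
        if len(fields) != 2 or not fields[1].isdigit():
            continue
        name, count = fields
        errors[name].append(int(count))
    return dict(errors)


sample = ["eth0 3", "eth1 0", "eth0 5", "garbage line here"]
grouped = group_by_interface(sample)
print(grouped)  # {'eth0': [3, 5], 'eth1': [0]}
```

The equivalent in AWK typically needs a flattened associative array with composite keys plus a separate counter array, which is the kind of workaround described above.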

On the other hand, because there IS much more to the Python language, there’s ultimately a bit more learning. But, it’s generally learning that at least ‘makes sense’ in the larger context of scripting and programming.

I also think that the indeni platform could (should) do things to make the learning curve easier (and make Python less ‘dangerous’). Certainly limit the available Python libraries (e.g., you don’t want people making http requests in the middle of the parser section). I can also imagine some kind of custom ‘framework’ for .ind scripts that makes it easier to leverage Python within the parser section. I don’t know: somehow providing ‘awk-like’ line matching, but where the ‘meat’ of the script is Python. Just an idea.
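One way the ‘awk-like line matching with Python as the meat’ idea could look, purely as a sketch: a tiny dispatcher where handlers are registered against regular expressions and each input line is offered to every rule, much like AWK’s pattern/action pairs. `LineMatcher`, its methods, and the metric names are all hypothetical; nothing like this exists in the indeni platform as far as this thread indicates.

```python
import re


class LineMatcher:
    """Hypothetical awk-like dispatcher: pattern -> action, applied line by line."""

    def __init__(self):
        self._rules = []

    def on(self, pattern):
        """Decorator that registers a handler for lines matching `pattern`."""
        compiled = re.compile(pattern)

        def register(func):
            self._rules.append((compiled, func))
            return func

        return register

    def run(self, lines, state):
        """Offer every line to every rule, in registration order."""
        for line in lines:
            for regex, handler in self._rules:
                match = regex.search(line)
                if match:
                    handler(match, state)
        return state


matcher = LineMatcher()


@matcher.on(r"^MemTotal:\s+(\d+)")
def mem_total(match, state):
    state["mem_total_kb"] = int(match.group(1))


@matcher.on(r"^MemFree:\s+(\d+)")
def mem_free(match, state):
    state["mem_free_kb"] = int(match.group(1))


result = matcher.run(["MemTotal: 2048 kB", "MemFree: 512 kB", "Other: 1"], {})
```

The script author only writes the decorated handlers; the loop over lines stays inside the framework, which is also where a platform could restrict what the handlers are allowed to import.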

Regarding unit testing: this would definitely add to the learning curve. Sort of. It’s already somewhat tricky to figure out testing with command-runner. But, I think it’s fair to say that learning to use a real unit testing framework, for someone who’s never done it, isn’t trivial. I think the Python unittest library is pretty clean, fwiw. command-runner is nice in that it’s just record and replay. I can understand that indeni would want to keep testing really straightforward; you don’t want IKEs lost in the minutiae of trying to write overly-complex unit tests, and I know that there’s a real art to this. So: maybe this is a drawback. And, at the same time, I know you could get better coverage and more robust tests with a real unit testing framework.
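For a sense of scale, a minimal `unittest` example might look like the following. The parser function and its input format are invented for illustration; only the `unittest` calls are real standard-library API.

```python
import unittest


def parse_cpu_idle(line):
    """Hypothetical parser: return the idle percentage from 'Cpu(s): 93.5 id'."""
    fields = line.split()
    if len(fields) >= 3 and fields[-1] == "id":
        return float(fields[-2])
    return None


class ParseCpuIdleTest(unittest.TestCase):
    def test_valid_line(self):
        self.assertEqual(parse_cpu_idle("Cpu(s): 93.5 id"), 93.5)

    def test_malformed_line(self):
        self.assertIsNone(parse_cpu_idle("unexpected output"))


# Run the tests programmatically so the example works when pasted anywhere.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseCpuIdleTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a real project the last two lines would normally be replaced by running `python -m unittest` from the command line. Each test feeds one specific input to one function, which is a finer-grained check than replaying a whole recorded command output.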

Hawkeye_Parker · May 24, 2018, 6:31pm

Devil’s Advocate: of course, I could be really underestimating the difference between the complexity of AWK and Python. Coming from my perspective, AWK is nutty, but if I really had no programming background, maybe it would just be ‘how it is’ and no more or less hard to learn than Python basics. The AWK language is certainly much “smaller” than Python, and in that sense, easier.
Conceptually, learning to use Python might(?) require learning at least the basics of OO programming. Hopefully we could keep things simple enough that no one had to try to understand, e.g., polymorphism. You might even be able to (somehow) make sure people didn’t even use classes at all, even if it was just by decree. Basic classes/objects don’t seem too hard, but even understanding what a constructor is – maybe that’s trickier than I think…

Joakim_Hedberg · May 25, 2018, 7:27pm

So, I’m not far into the parser part of indeni, but from what I’ve seen it’s not always simple. You use three different types (awk/json/xml).
Using Python, I’d gather you could replace all three with one language that is pretty versatile; you could do a lot of work and people would only need to learn one thing (I’m oversimplifying, of course).

Upside:
Python has a nice way to translate stuff to json, which is what Indeni really needs, right?
It’s flexible and fairly simple to learn (at least the basics)
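A short sketch of the ‘translate stuff to json’ point using the standard-library `json` module. The metric structure, metric name, and tag keys below are assumptions for illustration and are not the actual format the indeni server expects:

```python
import json


def to_metric(name, value, tags):
    """Build one metric as a plain dict (hypothetical shape)."""
    return {"metric": name, "value": value, "tags": tags}


metrics = [
    to_metric("memory-free-kbytes", 512, {"host": "fw1"}),
    to_metric("memory-total-kbytes", 2048, {"host": "fw1"}),
]

# Plain dicts and lists serialize directly; sort_keys keeps the output stable.
payload = json.dumps(metrics, sort_keys=True)
```

Going the other way is symmetric: `json.loads(payload)` returns the same list of dicts.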

Downside:
Speaking as a rules-person I think Indeni has a tendency to forget to document what happens behind the curtains, giving IKEs only half the picture.

To consider:
How is Python resource-wise vs awk/json/xml? Not sure about the current interpreters, but Python of course needs its own mothership to translate it into actual actions.

If Python is the way to go (and I really like the idea), people need to know about the base. I’m thinking, for instance, that if you inherit from a class, you need to know what’s going on in the base class.
I’m not pointing any fingers here and I don’t mean to sound bitter, that’s not the point. I have however noticed that the documentation in the wiki is lagging behind the engineering team’s updates. If you’re considering using yet another language, with inheritance etc., this should be a concern.

So to summarize from my perspective
Python good
Poor documentation bad

Vasileios_Bouloukos · May 27, 2018, 1:28pm

First, I have to mention that I have very limited experience with Python. Below is a list of pros/cons that we may face when using Python.

pros for using python

better support/documentation/forums
easier to implement complicated logic (better data structures, loops, switch, etc.)
higher probability of finding IKEs with experience in Python (or willing to learn it) compared to awk
probably the json/xml parsing could be done using similar logic
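To illustrate the last point, here is a sketch where JSON and XML inputs are parsed into the same dictionary shape with the standard-library `json` and `xml.etree.ElementTree` modules. The field names and the sample documents are invented for this example:

```python
import json
import xml.etree.ElementTree as ET


def from_json(text):
    """Extract name/status from a JSON object (hypothetical fields)."""
    data = json.loads(text)
    return {"name": data["name"], "status": data["status"]}


def from_xml(text):
    """Extract the same two fields from an equivalent XML document."""
    root = ET.fromstring(text)
    return {"name": root.findtext("name"), "status": root.findtext("status")}


json_result = from_json('{"name": "eth0", "status": "up"}')
xml_result = from_xml("<interface><name>eth0</name><status>up</status></interface>")
```

Both functions return the same dict, so whatever logic follows the parsing step does not need to care which format the device produced.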

cons of using python

awk is specialized for parsing; with Python we would need some boilerplate code to be copied around (for instance, something to trap lines with specific patterns)
with Python it will be easier to solve the same problem using different solutions/logic… this will cause debates that may lengthen the ‘review-process’ (sometimes it is better to have a single path to a solution)
we will need guidance (maybe templates/use cases)
(optional) extra time to move the existing awk to python.

Summary
I am in favor of trying Python. I’m not sure it will improve our productivity in the short term, but it will surely be better for complicated scripts.

Patrik_Jonsson · May 28, 2018, 6:16pm

Not sure what I feel here so I’m going to pour down my thoughts in a mass of text.

I know a bit of Python, but my gut feeling is that if you take a simple-ish language that is designed to parse text files and replace it with a fully fledged object-oriented language, you’re going to end up with confusion. For instance, I’m a real fan of Powershell but I would still prefer awk if I were to parse a text file. That said, using Python would open up all kinds of goodies too, just like Hawkeye has written.

A more complex scripting language would likely also mean more complex solutions. I’m just afraid that we’d end up with a mix of scripts most people don’t understand because they’re too advanced and scripts nobody understands because they’re poorly written.
A bit more on the same topic. A more complex language would make it harder to enforce conventions. One of the good things today is that going from script to script is fairly easy, since they’re “simple” most of the time. This could be fixed by some kind of framework though.
Going from Python to AWK is much simpler than going from AWK to Python. If we have issues finding people because they don’t know AWK, I don’t think this is the solution to that problem.

If the main issue here is code de-duplication perhaps there’s a way around it by modifying the awk engine?

/Patrik

Hawkeye_Parker · May 31, 2018, 11:26pm

The more I think about this, the more I actually agree with Patrik here (and disagree with some of what I originally wrote). As weird and limiting as the awk ‘array’ is, it’s pretty bullet-proof in terms of performance: you don’t have to think at all about whether or not to use an array vs. a set vs. a map, and more importantly, you don’t need to know anything about data structure time complexities, etc. We usually deal with such small data sets that, for a given script, it doesn’t really matter. But, things always add up with scale.

As other people have mentioned in this thread, the learning curve with Python is just much bigger, and the potential for confusion/bad code/disagreement, etc., is much higher than with AWK.

For me, the main issue is with duplicate code/code reuse. And, for a lot of this, we wouldn’t need Python directly in the parser section. We just need an extension mechanism like the existing AWK ‘helper functions’, but we need some way to scope the functions to a sub-set of scripts – we probably don’t want to pollute the global namespace with a bunch of functions that are really specific only to Checkpoint. Ideally, from my perspective, IKEs would have the ability to write these kinds of functions themselves, but maybe we just need to get more vocal about getting support for this kind of thing from the indeni platform team.

BUT, I think there are certain cases where we need a better solution directly in the parser; or, at least, something more than just a helper function. My best example is still the objects.C parsing code in Checkpoint. This data set is big enough that we need to stream it – we can’t collect it into memory and hand it to a function. Honestly, I don’t want to speculate on various solutions to this problem here, but certainly having access to Python directly in the parser section is one solution.
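To make the streaming requirement concrete, here is a sketch of how a Python generator can process a large input one record at a time without collecting it into memory. The block format below (`object <name>` … `end`) is made up for the example and is not the real objects.C syntax; `iter_objects` is a hypothetical name.

```python
def iter_objects(lines):
    """Yield one dict per object block, holding only the current block in memory."""
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("object "):
            # Start of a new block: remember its name.
            current = {"name": line[len("object "):]}
        elif line == "end" and current is not None:
            # End of the block: hand it to the caller and forget it.
            yield current
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()


sample = """object fw1
type = gateway
end
object net1
type = network
end""".splitlines()

# `lines` can be any iterable, e.g. an open file handle read line by line.
gateway_names = [obj["name"] for obj in iter_objects(sample) if obj.get("type") == "gateway"]
```

Because `iter_objects` is a generator, the caller can filter or emit metrics per block while the rest of the input is still unread.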

aldasouqi · June 1, 2018, 12:40am

As I understand it, you have a problem with duplicate code/code reuse. From the Python point of view, there are techniques such as loops (a while loop or a for loop) that we can use to reuse code where, when, and as many times as we want.
And as someone said: “I’ve just seen one too many unreadable, 1000-line awk programs to do something that’s a dozen lines of python.”

r808 · August 10, 2018, 10:31pm

I came across Pyed Piper (pyp) recently and thought something like this might be a good option. It has a bunch of the functionality of AWK, but it is Python-based.

https://code.google.com/archive/p/pyp/
Pyp is a linux command line text manipulation tool similar to awk or sed, but which uses standard python string and list methods as well as custom functions evolved to generate fast results in an intense production environment.

Hawkeye_Parker · August 14, 2018, 5:47pm

This is a really interesting tool, but I don’t quite see how we would use it in IND scripts. It looks to me like pyp is a commandline tool. Our SSH commandlines in IND are limited to whatever is supported by the target device (i.e., pyp wouldn’t be there).

In terms of somehow replacing or augmenting AWK in IND, maybe I’m missing something, but I have a hard time imagining how it would work. Our AWK scripts aren’t one-liners (and I don’t think they should be): they’re ‘actual scripts’ – multiline programs.

r808 · August 14, 2018, 8:35pm

Basically, the way I see it working is that the indeni server retrieves the response of the ssh command and runs pyp on the response locally.
I agree that pyp or something like it would have to be expanded to meet use cases such as multiline scripts, but I believe this is doable and would give additional features beyond what awk is capable of now.
Python is the number one language used in network automation, and I think supporting a “lite” version of it would grow interest and the community.

Hawkeye_Parker · August 16, 2018, 6:24pm

I get it. Absolutely agree with the ‘lite Python’ idea. Also found this, see “Sandboxing”:

http://pypy.org/features.html

I built CPython many years ago. IIRC, it wasn’t that hard and it wasn’t that easy. I expect it’s easier now than it used to be.