Category: Daily Report

  • Automation Flows In Companies- Days 96, 97 and 98

    Over the weekend I had a ten-mile walk into the countryside and a four-hour brainstorm with my business partner. All good stuff.

    Today I go back to N8N, the workflow solution. I'm following along with the official tutorial for the moment. It's clear to me that companies that embrace automation properly from the ground up will save a huge amount of time, freeing their staff to focus on the important stuff. Whilst not technically AI, automation is clearly part of the new paradigm shift that will begin happening across enterprises, especially the SME sector, which hasn't really normalised it yet.

    N8N Scheduling

    If you are going to build an automated network of actions for your company, then having scheduled triggers is a must. Of course you can use cron or scheduled jobs in a framework to do this, but having it in a nice, easy UI is just way better.

    In this example, you can see that you can easily set up these triggers with a lot of flexibility. They can fire every second or just once a month, and each time period has its own set of options. Just browsing these triggers gives you some food for thought about what you might want to start automating.

    Off the top of my head:

    • At the start of every week I might want a script that goes out and captures all the relevant news for me from several RSS feeds. Those can then be put into a workflow.
    • Or I might want to send out an email to the managers of departments to ask for a progress report.
    • Or it could be an email shot, a sales report, a Slack message or a LinkedIn message, or it could check which tasks were completed last week and alert you if anything top-priority is still outstanding.

    Anyway, the point is that time triggers give you a huge amount of flexibility. In the above image you can see two such triggers beginning flows (currently empty!) that run at the start and end of the work week.
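    For contrast, here is roughly what one of those weekly triggers would look like if you hand-rolled it in code instead. A minimal sketch, assuming the node-cron package; the two workflow functions are hypothetical placeholders:

    // Minimal sketch of "start of work week" / "end of work week" triggers
    // without n8n, using the node-cron package.
    import cron from 'node-cron';

    // '0 9 * * 1' = every Monday at 09:00
    cron.schedule('0 9 * * 1', async () => {
      await gatherWeeklyNews();
    });

    // '0 17 * * 5' = every Friday at 17:00
    cron.schedule('0 17 * * 5', async () => {
      await sendProgressReportRequests();
    });

    // Hypothetical placeholders for whatever the workflow would actually do.
    async function gatherWeeklyNews() { /* fetch RSS feeds, summarise, etc. */ }
    async function sendProgressReportRequests() { /* email department managers */ }

    Compared with that, picking a weekly interval from a dropdown in the Schedule Trigger node is obviously a lot less to write and maintain.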

    The NASA Node

    Whatever you might think about NASA and the space industry in general, it’s pretty cool that you can just pull NASA data into N8N. It takes a minute to get an API key from them, you drop it into N8N, and you can now fetch solar flare reports from the last week …

    You can also Test the step and see the output:

    N8N Conditionals

    You can then take the output from any node, in our case the NASA feed, and route it into a conditional. What I like about this is that N8N shows you the output of any API call and lets you drag any field straight into the conditional statement. Here I have dragged ‘classType’ from the left, dropped it into the conditional, and checked it for the string ‘X’, which is a class of solar flare.

    When you test it, you get a lovely UI showing whether the condition came out true or false.
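    For reference, here is a rough sketch of what that NASA node plus conditional is doing behind the scenes. It assumes NASA's DONKI solar flare endpoint and a NASA_API_KEY environment variable; the exact field names n8n surfaces may differ slightly:

    // Rough sketch of the NASA node + IF node behaviour in plain JavaScript.
    const NASA_API_KEY = process.env.NASA_API_KEY;

    async function getRecentXClassFlares() {
      const end = new Date();
      const start = new Date(end.getTime() - 7 * 24 * 60 * 60 * 1000);
      const fmt = (d) => d.toISOString().slice(0, 10);

      // DONKI FLR = solar flare reports between two dates
      const url = `https://api.nasa.gov/DONKI/FLR?startDate=${fmt(start)}&endDate=${fmt(end)}&api_key=${NASA_API_KEY}`;
      const res = await fetch(url);
      const flares = await res.json();

      // Same check as the conditional: does classType contain 'X'?
      return flares.filter((f) => (f.classType || '').includes('X'));
    }

    getRecentXClassFlares().then((xClass) => {
      console.log(`${xClass.length} X-class flare(s) in the last week`);
    });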

    The next step was outputting to postbin, which wasn’t working as expected. I will continue with this tomorrow.

    Automation Software Cuts Costs

    I don’t know why I haven’t played with automation software more over the years; probably because I was focused on coding. N8N abstracts away so much of the code needed to make this work. Unless you already have a foundation that can call any new API robustly and pipe the output around easily, you would have to start from scratch, and that takes a while. I love how automation software removes that hassle, and you can still integrate it with your internal systems if you want to.

    Licensing

    N8N has an enterprise licence, so if you ever want to self-host it commercially in production you still need to pay them for a licence. I totally get why they do this, but it’s annoying, since I wanted to use this open-source software freely. As I understand it, you can use it internally for free, but as soon as you want to build a product on top of it, the enterprise licence requirement kicks in.

  • Day 95 – HeyGen!

    Had a call yesterday with the guys at HeyGen.com

    https://www.heygen.com

    So these guys generate avatars for use in influencer marketing. Some key notes are:

    • They have an API you can use
    • Multilingual
    • Pre-generated ones can basically say whatever you want
    • You can speak to one of these avatars and it will talk back to you in real time, not pre-generated.

    During the meeting they were also able to take a four-second clip of me from the video call, render an avatar from it, and make me say anything. The mouth movements and voice aren’t great yet, but for a four-second sample it was good.

    It’s easy to see that we will soon be interacting with these agents on websites, and as you talk to them they will present the data needed to complete your goal alongside the conversation.

    I imagine operating systems will just have these built in. Back to ‘Clippy’!

    High-quality human avatars, driven in real time by LLMs and able to hold a conversation, will be the future of e-commerce.

    It was always going to happen, and it’s been tried before, but now it looks like the tech is finally going to converge.

    Websites will just have a shopping assistant that talks to you and suggests products, so you won’t have to do as much searching, or even typing (which might be a good thing for those of us who’ve coded for years) …

    Psychologically, humans will connect more with a human-looking avatar helping them complete their goals.

    You will talk to it, the LLM will run off and use RAG over the products available for sale, then come back with a solution and present it to the user. If it’s clothing, the avatar may simply change outfit to demonstrate it.
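    As a very rough illustration of that retrieval step, here is a minimal sketch of RAG over a product catalogue using OpenAI embeddings and a cosine-similarity lookup. The catalogue shape and helper names are hypothetical:

    // Hypothetical sketch: embed a shopper's question, find the closest
    // products by cosine similarity, and hand those to the avatar's LLM.
    import OpenAI from 'openai';

    const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

    const cosine = (a, b) => {
      let dot = 0, na = 0, nb = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
      }
      return dot / (Math.sqrt(na) * Math.sqrt(nb));
    };

    async function embed(text) {
      const res = await client.embeddings.create({
        model: 'text-embedding-3-small',
        input: text,
      });
      return res.data[0].embedding;
    }

    // products: [{ name, description, embedding }] prepared ahead of time
    async function findRelevantProducts(question, products, topK = 3) {
      const q = await embed(question);
      return products
        .map((p) => ({ ...p, score: cosine(q, p.embedding) }))
        .sort((a, b) => b.score - a.score)
        .slice(0, topK);
    }

    The top results would then be passed into the LLM's context so the avatar can talk about them (and, eventually, show them).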

    Further into the future, the avatar will potentially interact with the product, holding a laptop for example … I’m unsure exactly how, but I imagine eventually all products will be showcased in 3D, so the avatar will be able to work with that somehow.

    Interesting times.

    Every day I see something that makes me both sad and excited, depending on which perspective I want to take!

    Vercel AI Chatbot

    Carrying on from yesterday, I managed to switch the local chatbot over to OpenAI, so I now have the local UI up and running and hooked up to OpenAI’s API. I can also change the colour scheme now.

    Ok, that’s it for today’s R&D update.

    Will also have some news on the DXP project soon.

    We’ll also go through N8N workflows from WordPress to the socials.

  • Day 94 – Every company will just have their own internal chat

    A quick detour away from N8N today in my R&D time…

    The time has come where the app I’m working on really needs a traditional ChatGPT interface.

    I did have a developer spasm and think about building the UI from scratch for fun, but …

    Fortunately Vercel do some amazing templates out of the box for different purposes:

    https://vercel.com/templates/

    Vercel do an official chatbot one, and there’s a developer on GitHub who maintains a more feature-full one, although it hasn’t been updated for a while:

    https://github.com/mckaywrigley/chatbot-ui

    The official Vercel one is updated more regularly, so I’m just going to use that. Below is how it looks in the demo. I’m currently working through some bugs in its configuration, but it will be great to get this running locally so I can test it and see whether I really do need to start from scratch. It’s already integrated with Auth.js, which I don’t really want.

    The other one is based on Supabase, which is an incredible offering, but I’ll stick with the official template for now.

  • Day 93 – Nocode Automation Tools

    Yesterday I took a good step forward on a stealth AI project I’ve been involved with since January. It’s operating in a crowded market, but we’ve recently identified a better way to pivot. I’ll be able to announce it soon, once the final alpha tests have been done.

    Also this week we have potentially found a new addition to our management team – a proven AI lead generation specialist. But more on that another time.

    Today I began taking another look at N8N. I’d already tried it once and was impressed: I was able to quickly create a workflow that read a webpage, summarised it, and then pumped the summary into Slack as a message.
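    To give a sense of what that workflow replaces, here is a hedged sketch of the same read-summarise-post flow written by hand in JavaScript. The model choice and the SLACK_WEBHOOK_URL variable are my own assumptions, not what n8n does internally:

    // Rough hand-rolled equivalent of the n8n workflow: fetch a page,
    // summarise it with OpenAI, post the summary to a Slack incoming webhook.
    import OpenAI from 'openai';

    const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

    async function summariseToSlack(pageUrl) {
      const html = await (await fetch(pageUrl)).text();

      const completion = await client.chat.completions.create({
        model: 'gpt-4o-mini',
        messages: [
          { role: 'system', content: 'Summarise this webpage in three bullet points.' },
          // Truncate so a huge page does not blow the context window.
          { role: 'user', content: html.slice(0, 20000) },
        ],
      });

      await fetch(process.env.SLACK_WEBHOOK_URL, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ text: completion.choices[0].message.content }),
      });
    }

    In n8n the same thing is three nodes dragged onto a canvas, with credentials handled for you.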

    Honestly, these automation tools WILL be part of every single company going forward. They will replace a lot of internal software platforms.

    My first aim is to go from a WordPress post to publishing on:

    • X
    • LinkedIn
    • Facebook
    • Instagram

    There is already a template set up for this, which is what’s so powerful about N8N – most of what you want to automate has already been done.

    • Installing N8N with Docker is super straightforward
    • Look at the automation templates: https://n8n.io/workflows/
    • For local use it’s super easy to create a webhook tunnel

    https://n8n.io/workflows/3086-publish-wordpress-posts-to-social-media-x-facebook-linkedin-instagram-with-ai

    Today’s video analysis is of the guy who made Windsurf, which I haven’t used yet but am going to give a proper try. I might even prefer it to Cursor!

    The main takeaway is:

    • The sales team combines AEs with forward-deployed technical people

    In other news, this is a real-time object identification project that runs locally, which I want to try out:

    https://github.com/ngxson/smolvlm-realtime-webcam

  • Day 92 – OpenAI API Structured Outputs

    For me, Structured Outputs are essential when talking to an LLM. In most cases an application needs to use the data in a certain way, so you want OpenAI to send it back in a specific structure rather than having to reformat it yourself.

    Structured Outputs (SOs) ensure an OpenAI response is structured in the way you would like.

    Prior to SOs you would use a specific prompt to get the response in the right structure.

    Parsing a Rental Property Webpage

    Let’s define a schema that will take a blurb from a cottage rental website and output an object …

    import { NextResponse } from 'next/server';
    import OpenAI from 'openai';
    import { zodTextFormat } from "openai/helpers/zod";
    import { z } from "zod";
    
    export async function POST(request) {
      try {
        const client = new OpenAI({
          apiKey: process.env.OPENAI_API_KEY,
          organization: process.env.OPENAI_ORG_ID,
        });
    
        const requestData = await request.json();
        const { model, input } = requestData;
    
        const PropertyListing = z.object({
          cottageName: z.string(),
          numberOfGuests: z.number(),
          numberOfBedrooms: z.number(),
          numberOfBathrooms: z.number(),
          allowsPets: z.boolean(),
          keyFeatures: z.array(z.string()),
          marketingCopy: z.string()
        });
    
        const response = await client.responses.parse({
          model: model || "gpt-4o-2024-08-06",
          input: input,
          text: {
            format: zodTextFormat(PropertyListing, "property"),
          },
        });
    
        return NextResponse.json({ 
          success: true, 
          event: response.output_parsed 
        });
      } catch (error) {
        console.error('OpenAI API Error:', error);
        return NextResponse.json(
          { success: false, error: error.message },
          { status: 500 }
        );
      }
    } 

    I can call this API crudely with something like:

    // Fragment from a React component: setEvent, setError and setLoading are
    // useState setters defined elsewhere; the handler name here is illustrative.
    const handleParse = async () => {
      setLoading(true);
      try {
        const response = await fetch('/api/openai/parse-cottage', {
          method: 'POST',
          headers: {
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: "gpt-4o-2024-08-06",
            input: [
              { role: "system", content: "Extract the cottage information." },
              {
                role: "user",
                content: `webpage html here`
              },
            ],
          }),
        });

        const data = await response.json();

        if (!data.success) {
          throw new Error(data.error || 'Failed to parse cottage');
        }

        setEvent(data.event);
      } catch (err) {
        console.error('Error parsing cottage:', err);
        setError(err.message || 'Failed to parse cottage');
      } finally {
        setLoading(false);
      }
    };

    The response I get is:

    {
      "cottageName": "Point Break",
      "numberOfGuests": 8,
      "numberOfBedrooms": 4,
      "numberOfBathrooms": 3,
      "allowsPets": true,
      "keyFeatures": [
        "Beachside location",
        "Contemporary interiors",
        "Part of The Dunes development",
        "Traffic-free location",
        "Open plan living",
        "Floor to ceiling windows",
        "Balcony with ocean views"
      ],
      "marketingCopy": "A beautifully finished beachside house over three floors with contemporary interiors to the highest standard. Part of The Dunes development located directly in front of the beach. Tucked away in an almost traffic-free location. Open plan living, dining and kitchen space on the 2nd floor, taking advantage of the breath-taking views over the golden sand dunes, beach and Atlantic Ocean. The spacious living area comprises of two large sofas, TV, DVD player, coffee table, floor to ceiling windows, patio doors opening onto the balcony."
    }

    So what that’s done is convert some HTML into a JSON object that summarises the page. Useful!

  • Days 85 – 91 – Dropped into a part-time shit storm and extracting myself from it.


    100 days ago I decided to focus on becoming an AI startup. I gave myself a time limit (five months left) to demonstrate to myself that I could get something up and running.

    I was getting somewhat anxious about having no income coming in, so six weeks ago I took on a part-time freelance project for a company that was switching their booking system.

    I had planned on a nice ongoing earner of about 1.5 days a week of work, giving me plenty of time to focus on AI stuff.

    What actually happened, was I got dropped into a shitstorm.

    That said, I got them launched and have backed out now. I won’t go into details, to spare their blushes, but I thought I would write some general observations about IT project management.

    Why IT projects go bad, and what to do when they do.

    Software projects can essentially have infinite scope. There is always something that can be tweaked or added. Add business management to the mix and they will demand a long list of features …

    1 – You didn’t clearly agree scope with all stakeholders at the beginning, and get them to sign off on it.

    All parties have a responsibility to be as clear as possible about what the project is going to do. With money changing hands, it’s absolutely essential.

    If you are just prototyping, it’s slightly different.

    You would rarely let builders start on a home project without seeing some form of blueprint.

    In my early career I ignored scope, and most of the time I got away with it. But as soon as the projects became more complex, it became essential.

    The work spec is essential because:

    • You can protect yourself when the customer asks for more than they are paying for. As soon as the customer says, ‘well I thought I was getting X, I really need X’ … you can direct them back to the initial agreement. ALWAYS build what was initially agreed, release that, and then build the new feature afterwards.
    • If done correctly, the work spec will force the thinking in the development team of HOW they are going to implement it, broadly speaking, and so the system architecture begins to form.
    • New management or developers don’t know where they are without the work specification.
    • It increases the probability that the cost estimate is accurate
    • The customer is forced to think about what they actually want

    Don’t mistake flowcharts for a specification. Documenting scope is a skill. Flowcharts are an essential part of the spec, but they still don’t describe what is being built.

    2 – The initial scope was wildly excessive, or the customer adds ongoing scope throughout the project (and you let them)

    Since software scope can be infinite, it’s easy for a customer to say they want everything. Writing a work spec helps guard against this, but if you don’t really think through the detail involved you’ll get the cost wrong.

    In most cases, a good developer should identify the handful of functions that actually represent the core USP of the system.

    If you are building a new product from scratch, always identify the very core of what you are trying to do and focus on that. Get those shipped and out there, and begin testing from there.

    If you are halfway through a project, the BIGGEST problem I’ve always found is that management will add to the scope rather than go to launch. They will insist that feature A can’t go out without some additional ‘essential’ feature. This becomes a habit and the product NEVER gets launched.

    I worked on a project for two years, and toward the end management kept asking for more features … AND I LET THEM (at a price) … I was naive … Eventually upper-upper management came into the project, the manager demanding the new features said ‘they haven’t finished’, and upper management got cold feet and pulled it. We were so close to releasing a product that was perfectly timed to do well in the real world. Yet it failed to launch because scope kept getting added.

    3 – The developer never documented the system

    This is a big one. It doesn’t seem like it at the start, but eventually you need to bring other developers in … do you really want to explain how the system works over and over again?

    Documentation should be as concise as possible. You will get egos in the team who think they demonstrate their intelligence by producing a 100-page spec that no-one will ever have the time to read.

    Documentation isn’t the initial project scope. A lot of programmers say that the code should explain itself. That’s correct, but it also misses the point. Large codebases should be described (AI can help a huge amount now) … there are always things that fellow developers need to know.

    4 – You are building on a system that is shit

    This is something that isn’t easily fixed. When you are working on a system that hasn’t been thought through properly and has just been cobbled together over the years, it really does become a major problem. Developers don’t want to work on it, they get stressed, and they mimic the state of the codebase by taking further shortcuts. The problem compounds itself.

    5 – You are under-resourced

    There’s no way around it. If you don’t have the money to build your system, you will suffer. And if you are a technical founder and don’t have the time … you are not going to get there either.

    You avoid this by:

    • Being very clear about what you want and don’t want
    • Keeping your scope realistic
    • Making sure your work brief has identified how much it will REALISTICALLY cost, by working out how to do it

    If you are deep into a project and the money has run out … it’s generally because you’ve messed up somewhere earlier along the line. The main thing you can do is give equity to a decent developer and ask them to work on it at minimum wage for a while, to bring it up to the correct level.

    Or you just take out another credit card, but then you get very clear about what you are and aren’t doing.

    6 – Poor communication (too little or too much)

    The main problem with this last freelance project was that the two developers weren’t communicating successfully over email. They were effectively working in different timezones … When I came in, I was able to bridge that gap, and in just a few calls I resolved blockages that had been around for quite a while.

    Emails are not the best form of communication. You really need to have developers talking to one another and sharing screens so they can properly explain their problems.

    So that was under-communication.

    Then there is over-communication, where you get dumped with a ton of stuff that doesn’t matter … often information that should really live in a Google Doc that can be referenced. If you add to a developer’s mental overhead, their productivity drops. Often you just need to keep the technician focused and protected from the business decisions going on around them.

  • Day 71 – Sora!

    For today I thought I would just add my first Sora video.

  • Day 63 – Are you seeing it yet?

    The generative industry is moving so fast, with images, video, music, speech and the written word all now improving at a rapid pace.

    Many people are not seeing it.

    Today I was able to build a web tool that may not be production-ready but is really quite good for debugging and investigating an API. The ‘AI’ was able to understand some of my complex but not especially well-explained prompts in a way that blew me away, and produced some really valuable work.

    I don’t fully know whether programming is going to die as a profession. I think *eventually*, yes, it will probably be abstracted away much like assembly language has been. All the patterns will be fully understood, trained and tweaked, and models will be able to produce huge swathes of code that actually make a great product.

    When you do get into complex systems though – whilst I’m no Silicon Valley person – there do seem to be limitations, and an experienced human (developer) has to guide it through the last 5% to get it properly working.

    That’s just programming. The rest of the generative industry is equally mind-blowing.

    Are people really seeing what’s going on?

    And do they understand the magnitude?

    And are they embracing it?

    Work Diary

    Laravel Herd & Filament

    Whilst I have been prototyping with a vanilla stack, I checked out Filament last night on Laracasts:

    Filament Forms

    The form library alone is complete enough to warrant using Filament. It has so many small quality-of-life features that add up. From memory, here are a few that convinced me:

    • Really nice prefix and suffix options
    • Really nice icon and description and helper text integrations
    • Excellent search integration into long selects in one line of code
    • A really easy wizard workflow setup
    • Excellent integration of new entity creation when a user needs to be able to add a new option to a select list
    • Easy linking between SELECTS

    This is of course on top of it working solidly as you would expect.

    So, I’m flipping my prototype to use Filament … which is an excellent basis going forward.

    Filament File Uploads

    In one line, it’s got a drag-and-drop file upload with image preview built in. It uses FilePond under the hood.

  • Day 62 – Laravel + Frontend Tooling + Random Thoughts

    Today I was looking at making a final decision on whether to use a React or Vue frontend. My prototype has been coming along, but I had just been using plain Blade templates with vanilla JavaScript for speed.

    I think web development has got totally overcomplicated, and that you can go a long way with vanilla stuff. I would never start a PHP project without Laravel however, since I know it has virtually everything I need to build whatever I want.

    Since LLMs came about, there’s a lot to be said for just writing functionality in plain JavaScript, but ultimately you do want the benefits of a frontend framework. State management, DOM manipulation and an event bus are the reasons I want to use one. And of course, components. It’s a shame the official Web Components standard is so poor.

    React Or Vue? Or Livewire?

    The age-old debate for a developer. Quite frankly, it’s annoying at this point to continually have to decide what to use. I prefer the simplicity of Vue’s syntax, but I often take on graduates from a bootcamp that teaches React, and I have bought quite a few premium templates that use React too. To be honest though, with LLMs it’s easy to switch between them.

    Then there’s Livewire. Each time I’ve tried to get into Livewire I’ve been put off by it. But I had the same resistance when I got into Flutter, and once I persevered I loved it.

    To help make the decision, I looked into Filament.

    Filament

    Filament is TALL stack (Tailwind, Alpine, Laravel, Livewire) … and I’ve used previous versions and was fairly impressed. This would be a few steps back in order to take a leap forward, but looking at the excellent Laracasts for it, I’m starting to see the benefits.

    • The resource creation gives you the CRUD UI upfront … this is slightly different from the CRUD UI generation I have been working on, since mine is more ‘on-the-fly’ than defined in YAML files. Still, resource creation results in a full-on CRUD interface with sortable datatables out of the box.
    • The form library is pretty much flawless. No more messing about with forms.
    • Select options can be configured with enums and gets all enforced really nicely.
    • Multi-tenant stuff out of the box
    • Can hook up with Laravel Stripe for subscriptions

    Laravel Herd

    • I’ve been a fan of Laravel Sail for a while, but that’s only because it abstracted complication away from me. Whilst I get tired of learning new things, that’s no reason not to embrace something that works even better. Enter Laravel Herd. Some benefits:
      • I don’t have to use Docker Desktop, which decides to hog a huge amount of system memory. You can change this in its settings, but overall Herd is a lot faster.
      • For some reason, my local Composer would not install the Laravel CLI tool on my machine, so I had to resort to installing Laravel via shell scripts. Herd got it installed immediately.
      • Herd includes TinkerWell, some debugging tools, and general system configuration

    Consultancy Fees

    I’m getting tired of being asked to ‘look into stuff’ with the expectation that it will be done for free. I need to get a consultancy page hooked up to a simple credit card payment.

  • Day 61 – Prototyping

    I’ve been using my very basic prototype to store data about ‘things’ in my life – I use LLMs to set up this data, and then I have a UI that allows me to interact with it.

    One of the things I’ve been doing with it is keeping track of projects, ideas and tasks … and the problem I started to face was that I didn’t like the UI I had been using as the index for these data items.

    Last week I’d spotted a nice slider effect that I liked, and today I decided to put time into my own project again, for once, and just implement it.

    So I still have a CRUD interface for my data, but I’ve also integrated the beginnings of a nice swipe interface so that I can flick through each ‘thing’ one at a time, consider it, and then move on to the next.
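    For what it’s worth, the core of that one-at-a-time interaction is only a few lines of vanilla JavaScript. A minimal sketch; the element IDs, item fields and items array are hypothetical:

    // Minimal prev/next cycling through items, one at a time.
    // #thing-card, #prev and #next are placeholder element IDs, and the
    // title/summary fields are placeholders for whatever each 'thing' holds.
    const items = window.things || [];   // loaded however the app loads them
    let index = 0;

    function render() {
      const card = document.querySelector('#thing-card');
      if (!items.length || !card) return;
      const item = items[index];
      card.innerHTML = `<h2>${item.title}</h2><p>${item.summary}</p>`;
    }

    document.querySelector('#next').addEventListener('click', () => {
      index = (index + 1) % items.length;
      render();
    });

    document.querySelector('#prev').addEventListener('click', () => {
      index = (index - 1 + items.length) % items.length;
      render();
    });

    render();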

    I made an animated gif to briefly showcase it. Since I’m running out of time today I’ll leave it here … but it is currently cycling through my blueprints … it’s a good start, and it’s going to be a much more enjoyable way of moving through the things going on.

    And of course eventually the images will be generated according to the content.

    More updates tomorrow.